Second Idea: A more reasonable approach to visualizing the users web browsing session would be through a tree. A tree structure allows for a clear representation similar to web graph, but the initial computation to generate children of a node seems far more straight forward. As long as each node keeps track of how many children it has, centering these children in a straight row below their parent node requires multiplying the set size of each node (relative to pixels) by the number of nodes, with a few pixels keeping the nodes from touching.
My idea for the structure:
Each Node object would have five attributes: Its URL, Date/Time of initial visit (this session), an Array of other Node objects, a count of how many children this node has, and a Depth attribute to show its level in the tree.
URL: Straight forward, this is the most important bit of information to keep track of.
Date/Time: By giving a Date/Time attribute to each node I can give the ordering of a Node's children some significance, for example the ordering of children from left to right could represent the earliest visit to the most recent visit.
Array: For the recursive building of the tree.
Number of Children: This number is required to make computing the pixel real estate needed an easier processes, as briefly outlined above.
Depth: The root would be considered depth 0. This property is a late edition that I thought of last night to try to solve the problem of overlapping. Even though I can easily compute where children will be located I still had no way to prevent children from overlapping. However, if I can search for each node at a certain depth I can list them side by side such that none overlap - and spread them out in an even manner.
Doing this implementation I would have to have a set number of pixels to consider the maximum width. This wouldn't be all that hard, but I would then need to consider what would be a good bound on the maximum expected number of Nodes at each level, and make sure that my max pixel-to-Node-size-ratio is able to accommodate this.
Some Potential Problems:
1) Currently I have no way to deal with a cycle in the tree. A possible solution would be while building the graph flag a Node if that Node has already been entered into the tree but is a child of another Node - instead of adding it to the tree again or creating a path back to that Node which would most likely overlap other Nodes the next Node in the tree could be a Node with no children and arbitrary properties that says "See Node _____". But this creates a loss of relevance in what the tree is trying to convey (see next problem).
2) How can I distinguish Site A being reached from Site B from, say, being at Site A then typing a completely new website into the search bar - Google, for example. In my structure there would be a path from Site A to Google even though there was no direct link following. This could potentially be a problem, but it could also be the right thing to do - the Google query could be provoked by the content of Site A and therefore there really should be a link between them. Would this be a good design choice? The flaw is that in most cases Google (or the users preferred search engine) would be visited multiple times per session; if I use the possible solution to problem 1 then all relevance between Site A and the result of the Google query would be lost.
Closer look at WebReview and thoughts of it:
What WebReview is missing is an annotation feature. I feel I could implement this reasonably well, except to maximize its usefulness a Session and its resulting graph would have to have the ability to be saved. Saving into a html document as WebReview does is a complicated process, so instead an option I could pursue would be to save just the Nodes and their properties into a CSV file. Loading would only require parsing the file and rebuilding.
I found this great site that lists some projects that were being developed about a decade ago, similar to what I am attempting. What I envision is something similar to the SurfSerf application listed there (but less 90's Geocities-ish, I hope). Unfortunately the site is no longer maintained and none of the links work - but the pictures are representative enough.
To Accomplish Today:
- Create a test extension that adds URLs to an array, with a button to open a window listing each of the stored URLs.
- Maybe learn some sqlite.