My process for "Thinko" started as I tried to get as much information as possible from Windows XP. Windows handles its inter-process communication relatively insecurely, and I thought it would be fun not only to capture "native" Windows information, but also to visualize it in a manner that removed it entirely from its original context.

Visualizing this data would hopefully give some insight into how our computers behave when they're "sleeping" (when we're not actively using them). Most people (myself included) don't actively think about what computers are doing when they're not being "used". In most cases they don't simply "sleep" but instead index our data, perform routine operations, etc. I hoped that seeing and comparing this data would give some insight into not only what computers do when we leave them alone, but also how trends in our computer use can be gleaned from our computer's at-rest state.

My first tests were centered around getting information on the running processes (total number, names, handles, memory usage, etc.). I quickly ran up against a wall. The name of that wall was the Win32 API. The Python bindings for the Win32 API are included with a module called Python For Windows and more specifically the CON/COM modules. At this point I was working through multiple layers of abstraction to get process information, which made it relatively slow (for real-time rendering) and left me with very little control over the formatting of the data. These two problems seemed insurmountable given the time allotted for the project.

I was committed to using the Win32 API at this point, however, so I rooted through it to see what other data I could capture. In the API I found a relatively simple method to capture information that had been copied to the Windows clipboard. In fact, it was robust enough to distinguish between different types of data (file, image, text).
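To illustrate how clipboard contents can be told apart by type, here is a minimal sketch. It is not the code from clip.py. It assumes the pywin32 `win32clipboard` module and the standard Win32 clipboard format identifiers. The function names and the priority order (file, then image, then text) are hypothetical choices made for this example.

```python
# Standard Win32 clipboard format identifiers (values from the Windows SDK).
CF_TEXT = 1
CF_BITMAP = 2
CF_DIB = 8
CF_UNICODETEXT = 13
CF_HDROP = 15  # a list of file paths, e.g. files copied in Explorer


def classify_formats(available):
    """Map a collection of format ids to 'file', 'image', 'text', or None.

    The priority order is an assumption made for this sketch.
    """
    formats = set(available)
    if CF_HDROP in formats:
        return "file"
    if formats & {CF_BITMAP, CF_DIB}:
        return "image"
    if formats & {CF_TEXT, CF_UNICODETEXT}:
        return "text"
    return None


def read_clipboard_kind():
    """Windows only: ask pywin32 which of the known formats are present."""
    import win32clipboard  # imported lazily so this file loads on any platform

    candidates = (CF_HDROP, CF_BITMAP, CF_DIB, CF_TEXT, CF_UNICODETEXT)
    win32clipboard.OpenClipboard()
    try:
        present = [fmt for fmt in candidates
                   if win32clipboard.IsClipboardFormatAvailable(fmt)]
    finally:
        # Always release the clipboard so other programs can use it.
        win32clipboard.CloseClipboard()
    return classify_formats(present)
```

Keeping the classification separate from the Win32 calls means the decision logic can be exercised on any machine, while `read_clipboard_kind()` only runs where pywin32 is installed.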

Armed with my new implementation of the Win32 API I went to work developing a system whereby I would be able to capture clipboard data in (semi) real-time from multiple sources and display it through a single visualization. To accomplish this, in addition to the Win32 API, I used ftplib to transfer files and OpenGL/GLUT for the visualization.
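As a rough idea of what the transfer step could look like with ftplib, the sketch below uploads a captured string as a small file. It is written for Python 3 and is not taken from clip.py. The function names, the file naming, and the UTF-8 encoding are assumptions for illustration.

```python
import io
from ftplib import FTP


def upload_text(ftp, remote_name, text):
    """Store `text` on the server as `remote_name` over an open FTP session."""
    payload = io.BytesIO(text.encode("utf-8"))
    ftp.storbinary("STOR " + remote_name, payload)


def upload_once(host, user, password, remote_name, text):
    """Connect, log in, upload a single string, and disconnect."""
    with FTP(host) as ftp:  # FTP objects are context managers in Python 3.3+
        ftp.login(user, password)
        upload_text(ftp, remote_name, text)
```

Passing the session into `upload_text` makes it possible to reuse one connection for several uploads, or to substitute a stand-in object when no server is available.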

The final project took the form of two distinct programs: "CLIP", which captured the clipboard, and "THINKO", which visualized the data that the CLIP clients had stored on my server. This allowed me to compile CLIP down to an .EXE (plus support files) using py2exe, which I then bundled up with Inno Setup, creating a very portable package (one that didn't require users to have Python development tools on their machine). The CLIP program runs in its own window via Tkinter. It runs passively and without any intervention from the user. It does, however, show what is being uploaded to the server, both for debugging purposes and to give the user some level of understanding of what's going on while I harvest their data (it would seem rather disingenuous otherwise).

You will notice from browsing through the CLIP code that there is very little in the way of security. It is very possible for users to "game the system", as it were. In fact, the FTP login/password is embedded within every CLIP client. It seemed ridiculous to me, however, having designed a platform for stealing data from users, to lock it down from any prying. The only precautions I put in were to help the visualization run as smoothly as possible. For example, CLIP creates a temp file, and every time it checks the clipboard data, it also checks its temp file to make sure that it's not continually uploading the same, static data. Also, all non-alphanumeric characters are stripped from the clipboard data to prevent the GLUT font rendering from choking on strange characters (as well as to make it more aesthetically pleasing).
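The two precautions described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code in clip.py: the function names and the cache file name are hypothetical, and keeping spaces between words is an assumption (the stripping rule as described would remove them too).

```python
import os
import re
import tempfile

# Assumption: spaces are kept so that words stay separated.
_NON_ALNUM = re.compile(r"[^A-Za-z0-9 ]+")


def sanitize(text):
    """Drop everything except ASCII letters, digits, and spaces."""
    return _NON_ALNUM.sub("", text).strip()


def default_cache_path():
    """Hypothetical location for the file that remembers the last upload."""
    return os.path.join(tempfile.gettempdir(), "clip_last.txt")


def is_new(text, cache_path):
    """Return True (and update the cache) if `text` differs from the last upload."""
    previous = None
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as fh:
            previous = fh.read()
    if text == previous:
        return False
    with open(cache_path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return True
```

A polling loop would call `sanitize()` on each clipboard reading and upload only when `is_new()` returns True, so a clipboard that sits unchanged for hours produces a single upload.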

I had quite a bit of trouble with my visualization, because I wanted to provide the viewer with a sense of "scale" when viewing the data. Ultimately I decided upon creating a "cityscape" of text strings rising from the ground. The user is able to navigate the environment by both zooming in and moving vertically while all of the text slowly orbits around the center. Some of the strings were in fact exceptionally long, while others were a single word. Also, I arranged the strings not by length, but by when they were collected, with the oldest at the center and new strings being added to the periphery.
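One way to get an "oldest at the center, newest at the periphery" arrangement with a slow orbit is a spiral whose radius grows with collection order. The sketch below is not how thinko.py computes positions. The spiral, the spacing, and the rotation speed are all assumptions chosen for illustration.

```python
import math

# Successive items are offset by the golden angle (about 2.4 radians),
# which spreads them evenly around the center.
GOLDEN_ANGLE = math.pi * (3.0 - math.sqrt(5.0))


def ground_position(index, spacing=1.5):
    """Place the index-th string (0 = oldest) on a spiral around the origin.

    Returns (x, z) ground coordinates; the text itself would rise along y.
    """
    radius = spacing * math.sqrt(index)
    angle = index * GOLDEN_ANGLE
    return (radius * math.cos(angle), radius * math.sin(angle))


def orbit(position, elapsed_seconds, degrees_per_second=2.0):
    """Rotate a ground position about the center to produce the slow orbit."""
    theta = math.radians(degrees_per_second * elapsed_seconds)
    x, z = position
    return (x * math.cos(theta) - z * math.sin(theta),
            x * math.sin(theta) + z * math.cos(theta))
```

Because the radius depends only on the collection index, a newly captured string always lands farther out than everything collected before it, regardless of its length.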

There are no tags attached to any of the text to denote where or whom it came from. I felt that people would be far less interested in participating unless they knew that their information was being collected in an at least semi-anonymous fashion.

I feel that in the end the project was a success. I met my personal goals and learned quite a bit about Python, OpenGL/GLUT, the Win32 API and other assorted tools (py2exe, Inno Setup, Tkinter, among others).

Screenshots:




Code:
clip.py (clipboard capture engine)
standalone.py (Tkinter implementation)
thinko.py (visualization)

Media:
ScreenShot 1
ScreenShot 2
ScreenShot 3
Video

Etc:
clipinstaller.exe (installer)
card (instructional card)