It’s been a while since I’ve listed any activity, so here’s what’s been up:
Twitter Bot: I took him offline a good while ago because as the source file grew, the time to create a distweet from the source also grew. It was taking an absurd amount of time, mostly because the source grew quite quickly. So I weighed my options:
1: Manually wipe all or part of the source file – While fast and easy, it’s also manual and will need to be done repeatedly. Frankly, I don’t want to manually do anything. Hence writing Greasemonkey scripts to play Facebook games for me. This obviously leads to (2).
2: Write code to wipe all or part of the source file – Fairly tempting and easy-as-pi, but it seems like a kludge nonetheless. A very appealing kludge, actually. To be completely honest, even this far into my commitment to my chosen path, it’s still nagging me that I should at least do it and put my bots back online until I get the motivation to fix them via the chosen route.
3: Ignore the problem altogether – Perhaps this really should be the first option I would think of, but actually, I only just now realized it was an option. I would say this is an example of how my mind works (forgetting the most obvious of available options), but that’s not entirely true. In most situations, I try to consider and weigh all options, then berate anyone near me who chose the wrong one. How I maintain a girlfriend is quite a mystery based on this alone. When programming, however, this is generally how I am and rightfully so. It’s the easy way out. It’s writing databases in first normal form. F that S.
4: Throw another buzzword at the problem – This was my chosen route. I decided to fix the source cancer[1] and the Markov chain generation lag with an SQL database.
SQLite was an obvious choice for this, and after a day of fighting with installing the gem, I realized I was an idiot: I was not doing it wrong, but was trying to use the wrong version. It wasn’t well documented, but I’ll take the blame.
Now that I could use SQLite, I had a few problems ahead of me. I took a database class several semesters back and learned all the proper SQL uses. I then immediately followed it with an internship and had to actively ignore half of the laws of SQL I learned.
Foreign keys? Pfff. Constraints to enforce them would break the nature of the system. Technically they exist, but there’s nothing saying this is the foreign key.
Primary Keys? For newer tables: generally okay. For legacy (read: core) tables: Sort the candidate keys by most-likely-to-change and then pick the one whose value must be able to change because of business requirements. Use this field as the “foreign key” into other tables as well.
(small ex-work rant) Specifically, I'm saying: Ignore the integer autoincrement not null field named "key" and instead use the client's SSN as the primary key for the client table and as the foreign key for it in every other table that refers to a client (easily 95% of all tables). So if a client's SSN changes (a rare but possible event, and a business requirement), you not only update the record in the client table, but every record for the client in 95% of all the tables in the database. To do this, someone has to go through all the tables, identify the column corresponding to the SSN (5 major variations of what this field may be called), and record the table and column in another table, so a program can query this table and update all the fields. The main problem with this is that the number of tables grew crazily fast, so maintenance is always an issue, and often overlooked during test and deployment.
My second problem to overcome is that SQLite is an SQL variant that I hadn’t used before. I’ve mostly used MS SQL, which I’ve realized is quite user friendly, and Oracle, which I’ve realized is quite user unfriendly. I’ve mostly forgotten the quirks[2] of Oracle that I learned in the trenches fighting with it, KA-BAR in hand.
(another ex-work rant) Mini-Quiz: When you tell your client (not the client's clients that the database tables refer to in previous rant) that your application will work with any SQL DBMS and they require Oracle, the best plan is to:
a. immediately set up Oracle in-house and do development and test using Oracle from the start
b. work with MS SQL until two weeks before the deploy-to-client deadline then switch to Oracle and see what doesn't work
c. ignore the requirement for Oracle and try to claim that Oracle was never agreed upon.
If you said B, you may have been a project leader at my ex-work. I'll give you a little insight into how well that went. When switching the database backend from MS SQL to Oracle, nothing will work. Nothing. And two weeks is not enough time to make it all work. Two months is not enough time. You can get most of it converted in those two months, but you'll have to abandon starting work on other parts of the application and abandon working on other clients' applications for this to happen. You'll also drive everyone insane as they're simultaneously trying to learn Oracle SQL and trying to learn Oracle's god-awful frontend.
(small Oracle rant) The Java GUI frontend to Oracle is awful. God-awful. It's more grotesque than two-girls-one-cup. The error messages were also extremely unhelpful. They almost always claimed the error was something that it actually wasn't. The GUI provided two ways to run queries. The obvious GUI way would give you these errors but not the location of the error. The unobvious pseudo-CLI way would at least give you the location of the error. God, that GUI was terrible. I really hope it's improved by now. Or that everyone uses the CLI, because CLI users are not expecting anything eye-pleasing.
(small Work/Oracle rant) One particularly huge problem I ran into during the conversion was translating a vital query that produced the data for a mandatory government report. The query was huge. It was also legacy. When I asked the resident government-regulations/sql-abuse/client-BSing guru about the query, as he had formerly been in charge of it, he handed me a packet explaining the requirements and specific formats the data must be in and told me that he'd inherited the query from an early client that had written it in Access.
I spent weeks trying to decode the query and then determine which parts would work in both MS SQL and Oracle SQL and which parts would only work in MS SQL and would therefore have to be separated so the application could choose which queries to use based on its database backend.
The massive query had nested sub select statements 3-5 deep. I wish that was an exaggeration. I wish that last sentence wasn't cliche. The nested sub selects used WHERE clauses that matched to values on the outermost query. MS SQL handled this fine. Oracle could only handle this 1 sub select in. I had actually discovered a way to phrase the select and sub select statements in a way that allowed me to get another level deeper, but by the time I would have gotten 5 deep, it would have been a complicated mess. Probably inefficient too. I didn't bother testing.
Back to my Twitter bots and my choice to complicate them with a database. I chose a database as it would allow me to rebuild Markov chains for a particular source text from the database instead of redoing it manually. Theoretically it should be faster, especially for a large source text (a character dispress of the complete works of Shakespeare, for example). Databases require a proper design, and a proper design requires careful consideration of proper normalization, and remembering all the fields. It also requires consideration of how the design is going to affect the queries needed to insert, update, and retrieve the information. Overcomplication could kill the speedup I want, and make my code even more confusing. Unfortunately, a full class schedule plus work consumes a lot of time and energy. Luckily, Financial Management allowed a lot of time to doodle database diagrams. It also gave me time to develop an algorithm for measuring a dispress, but I’m not quite ready to fully discuss it, partially because I came up with it pre-finals and now I can’t remember the details offhand, so I need to translate my notes and formulas for the algorithm from my handwriting to proper English.
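For anyone wondering what would actually go into such a database, here is a minimal sketch in Ruby. It is not the bot's real code: it assumes a word-level chain (the bot may well work on characters), the method names build_transitions and generate are made up for illustration, and a plain Hash stands in for what would be a table of (word, next_word, count) rows.

```ruby
# Build a word-level Markov transition table from source lines.
# Assumption: a nested Hash stands in for a database table whose rows
# would be (word, next_word, count).
def build_transitions(lines)
  transitions = Hash.new { |hash, word| hash[word] = Hash.new(0) }
  lines.each do |line|
    # Count every adjacent pair of words in the line.
    line.split.each_cons(2) do |current, following|
      transitions[current][following] += 1
    end
  end
  transitions
end

# Walk the table from a starting word, choosing each next word with
# probability proportional to how often it followed the current one.
def generate(transitions, start, max_words: 20, rng: Random.new)
  words = [start]
  while words.length < max_words
    # fetch avoids creating empty entries for words with no successors.
    choices = transitions.fetch(words.last, nil)
    break if choices.nil? || choices.empty?
    # Expand counts into a weighted pool and sample one word from it.
    pool = choices.flat_map { |word, count| [word] * count }
    words << pool.sample(random: rng)
  end
  words.join(" ")
end
```

The point of keeping counts rather than raw text is that new source lines only add to or increment rows, so the chain never has to be rebuilt from the whole source file for each tweet.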
I recently pulled out my diagrams, revised them, and built a test database, but unfortunately, to test the database, I really need to convert my bot. Every time I start, I stare at the code, scratch my head, curse myself for not writing books of comments for each part of the code, and then quit and instead write pseudocode for when I finally get my hands bloody.
After writing all this, auto-kludging my twitter bot seems like a great idea at the moment until I do a real conversion.
Other Diversions: Much like my game-playing Greasemonkey scripts, I have also been working on a Ruby bot for another game I play and a couple of different bot programs using it. I’ve been using Mechanize for my bot, and I really love it. I should also say that I really dislike web programming. Writing both server side and client side code is treacherous. Security, session, and timeout issues are always looming over every statement you write.
I’ve abandoned my Facebook-game scripts. I went on vacation over the summer and came back with no urge to “play” them and thus haven’t looked at them in ages. If you’ve come looking for support, I’m sorry. Hopefully someone stepped up and kifed my scripts and kept the spirit alive. Do give credit at least so they know to ban my account from their games too when they finally get wise to my tricks.
I’m gonna end this here. I’ve drunk 2+ litres of coke since ~4pm yesterday and it’s 2:46am now and I’m not tired, which is a bad sign. Late night + coke leads to the sort of ramblings you’ll find above. I really do think I’m gonna kludge the source. I don’t have the attention span at this time to proofread for coherence and grammar. Please forgive these errors.
Notes:
[1] Source cancer: uncontrolled growth of the source.
[2] Some of what might be considered a quirk from the MS SQL to Oracle POV may be considered the lack of a coddling feature from the Oracle to MS SQL POV.
[update]
I did implement the automatic source trimming solution and it works far better than expected. I had no idea how well Ruby, or the old, dying desktop I use as a personal server, would handle reading in thousands of lines from a text file, storing them in an array, and then sorting that array (randomly of course, so as not to be ageist). It handles it well, and finishes faster than I can scan the command prompt to find where it started the trimming process. Another time Ruby has completely amazed me. I still plan to implement the SQL solution, and I’m still working on the measurement algorithm.
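A rough sketch of what that trimming could look like, under some assumptions: the method names trim_lines and trim_source_file and the keep count are hypothetical, and the real script may differ in how it decides how much to keep.

```ruby
# Keep a random sample of at most `keep` lines. Shuffling first means old
# and new lines are equally likely to survive.
def trim_lines(lines, keep, rng: Random.new)
  return lines.dup if lines.length <= keep
  lines.shuffle(random: rng).first(keep)
end

# Read the source file, trim it, and write the survivors back in place.
# Returns the number of lines kept.
def trim_source_file(path, keep)
  lines = File.readlines(path, chomp: true)
  trimmed = trim_lines(lines, keep)
  File.write(path, trimmed.map { |line| line + "\n" }.join)
  trimmed.length
end
```

Called from cron or at the top of the bot's run, something like trim_source_file("source.txt", 5000) would cap the file at 5000 lines; both the file name and the number are placeholders.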
With that done, you’ll notice that on the right side of the home page you can read the sometimes-coherent ramblings generated by a man who learned to speak by reading Twitter, whom I named Desmond Keller; Desmond after Desmond Dekker, and Keller after, as best I can figure, Helen Keller. Feel free to follow him, but be warned:
(1) He’s completely unmoderated, with the exception of the most basic attempts to remove @ replies and URLs (though they may slip through as it’s not foolproof, and I’d rather have false negatives than false positives) to try to prevent bothering others and inadvertently passing along possible spam and pinup girls.
(2) He updates frequently. Specifically, every 20-30 minutes, as in 48-72 updates a day, assuming no errors and no ‘server’ downtime. I follow a manic webcomic artist that updates less than that, so putting him in a special Tweetdeck group is a suggestion.
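As an illustration of the kind of basic filtering mentioned in (1), here is a sketch. The patterns and the scrub method are assumptions, not the bot's actual rules; they are deliberately narrow, so some mentions and links will slip through (false negatives) rather than ordinary words being removed (false positives).

```ruby
# Matches @name only when the @ is not preceded by a word character,
# so email addresses like joe@example.com are left alone.
MENTION = /(?<!\w)@\w+/

# Matches only links with an explicit http:// or https:// scheme.
# Bare domains such as www.example.com are intentionally not matched.
URL = %r{\bhttps?://\S+}i

# Strip URLs and mentions, then collapse the leftover whitespace.
def scrub(text)
  text.gsub(URL, "").gsub(MENTION, "").split.join(" ")
end
```

Each line would be run through scrub before it is added to the source text, so the chain never learns the mentions or links in the first place.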
Secondly, I put my tangential rants in code blocks so they’re visually different from the rest of the post and easier to skip if you desire more coherence and cohesiveness in the post. I wish WordPress had collapsible “spoiler” boxes that I could have hidden the rants in.