May 28th, 2009 / No Comments » / by Chris
Found a cool site called Project Euler
From their description:
Project Euler is a series of challenging mathematical/computer programming problems that will require more than just mathematical insights to solve. Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required to solve most problems.
I’ve done a few of the first problems, implemented them all in ruby so far (or just did the math required), and they’ve definitely been interesting and fun to work on. 4 down, 242 to go. All of the problems are math related, and you are basically presented with the problem and a field to input your answer. After you supply the correct answer you’re given a link to the forum (most of which are locked now) where people have discussed their approaches to solving the problem.
If you’re looking to brush up on some math skills, or just need some interesting problems to throw some code at, this site provides some great things to work on.
Posted in: Data and Algorithms, Learning, Programming Problems
Tags: Math, project euler
May 27th, 2009 / No Comments » / by Chris
Stumbled across this site: Programmer Competency Matrix
It’s a pretty good summary of the different levels of programmer competence in a lot of areas. Mostly I’m finding it useful in seeing where I’m at and getting suggestions on what to look into next in an effort to become better at programming and software development.
Posted in: Learning, Misc
Tags: Learning, programming
May 20th, 2009 / No Comments » / by Chris
After trying to use my previous map implementation in a Firefox extension I’ve been working on I realized that you can’t add random attributes to some objects you deal with in that world. This made my previous approach unworkable. After doing some more searching, I found an approach that seems to work well so far.
The basic idea is that we want to be able to generate an identifier for a random object consistently, then we can associate this identifier with a value. The limitation is that the identifier needs to be a string (limitation of Javascript’s internal associative array/object construct), and that native javascript objects don’t have an internal mechanism for generating an ID (unlike Java’s Object class, which provides a hashCode() method).
The process to solve this problem has two steps in my current implementation:
- Store the ‘key’ object in an array
- Use the ‘key’ object’s array index as the key value for the association
This way, I can tell if a passed in ‘key’ object has a value in my map by seeing if it’s in my internal array. If it’s in there, I can look up it’s index’s value in my associative array (or anonymous object). If I’m adding a new key/value pair, I insert the key object into my internal array and then use it’s index as the ‘key’ for my associative array. Removing a key/value pair from the map involves the following steps:
- Look up the index of the key object in my internal array
- Remove the value that corresponds to the index from the associative array/anonymous object
- Remove the key object from the internal array
Here’s an example of the code that inserts a key/value pair into the map, this is a snip from the code, which is more object-oriented:
// Internal holders
var _internalMap = {};
var _internalArray = new Array();
/*
* Private function for getting the key for an object
*/
function getKey(obj) {
var key = _internalArray.indexOf(obj);
if(key == -1) {
_internalArray.push(obj);
key = _internalArray.indexOf(obj);
}
// "Keys" will be of the form "OR.x" where 'x' is the index in the array
return "OR." + key;
};
/*
* Internal put method
*/
put = function(key, value) {
var objKey = getKey(key);
_internalMap[objKey] = value;
};
The full implementation has get, remove, and size methods as well, and is built as a class, but this kind of shows the mechanism behind it all. One thing to note is that this implementation relies on Javascript 1.6 (or higher) as the ‘indexOf()’ array method was introduced in Javascript 1.6
I’ve attached the new version (version 0.2), you can find it here: JsMap Version 0.2
Posted in: Learning, Programming Problems
Tags: javascript, map
May 18th, 2009 / No Comments » / by Chris
Threading is a topic that I feel gets asked about a lot in interviews, but in certain practices doesn’t get used very much, so it’s easy to forget the details. I stumbled across this really quick java threading/concurrency review that didn’t really teach me anything new, but I still thought it was a good reminder:
Java Concurrency / Multithreading - Tutorial
Posted in: Programming Problems, java
Tags: concurrency, java, quick, threads
May 13th, 2009 / No Comments » / by Chris
Small update, the recursion was killing me, so I introduced some attempts to control how deeply the code recurses. My approach was basically to introduce a depth count, and if the depth count reaches 0, I force a storage growth on the next rehash, thus reducing the number of collisions by increasing the storage size.
This approach is working pretty well, especially if the size of the storage is set to a larger size to start with. The insertions and hashing and lookups are fast, it’s the rehashing and collisions that cause lots of problems.
Cuckoo Hash Part 2 - Second ‘version’ of my cuckoo hash implementation, including the changes above. It now uses a list of 499 commonly used english words, and inserts 6 permutations of the list (pattern: “word-i“, where i is 1-6), for a total insertion count of 2994.
Posted in: Data and Algorithms, Learning, Programming Problems
Tags: cuckoo hash, ruby
May 12th, 2009 / No Comments » / by Chris
From Wikipedia:
Cuckoo hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table
Very basically, a Cuckoo hash is a table, where the position of an element can be one of two places. On insertion, if there is another element (call it ‘y’) in the candidate spot, it is replaced by the item you are inserting. Then you re-add the ‘y’ element to the hash, using the second hash method. Doing this could cause another element (call it ‘z’) to be displaced, in which case the cycle repeats until all elements are in their new spots.
To look up if an element is in the table, you need to check both possible spots.
You can run into a problem with cycles when inserting values if, for example, h1(x) = h2(y) and h2(x) = h1(y), but you can limit the number of re-insertions to avoid this. If you reach this limit, you can change your hashing methods and re-insert everything again. This problem happens even more frequently if the load factor on your table is too high, so you can put checks in place to increase the size of the table if you detect that the load is getting high. In my example, I set the top table load factor to 60%, at which point I double the size of the table and re-insert the elements.
The main problem I’m running into is that the insert method can call the rehash method if it loops too many times trying to place everything…but rehash calls insert to rebuild the table with new hashing methods. This seems to lead to deeply recursive problems, even with a load set to 50%. I’m not sure how to fix this right now, but I’ll keep working at it and take any suggestions. One method might be to find better hashing algorithms, but I thought the one I was using was descent.
Anyway, since I’m still learning about it, here’s a few links to read:
Posted in: Data and Algorithms, Learning, Programming Problems
Tags: cuckoo hash, hashing, ruby
May 11th, 2009 / No Comments » / by Chris
One of the downsides to javascript (at least in my opinion) has been it’s lack of more intelligent collection classes, such as the Map. Javascript’s array and object constructs are very flexible and useful in their own right, but sometimes you just need a way to map one object (as a key) to some arbitrary value, and then have the possibility to look up that value later. I was playing around with some ideas for how to get a simple map to work in javascript and came up with my JsMap class. It’s a very small class, more of an exercise than anything, but maybe someone will find it useful.
This map class is very limited, as (as far as I can tell) javascript objects don’t have internal hashing methods or object ids that allow for simple identification in the context of a map. In order to make up for this, my map class generates a random ID for a passed in “key” object and then uses that random ID as the key in the internal map. The generated key is then stored in a variable __jsKey in the passed in “key” object. When the “get” method on the map is called, it looks to see if the passed in “key” object has a __jsKey variable, and if so, looks up the value associated with it in the internal map.
Here is a link to the attachment page containing the javascript file, which I’ll update with changes as they’re made: [JsMap].
Here’s a simple HTML test of the map class: [JsMap Test/Example usage file]
Posted in: Programming Problems
Tags: javascript, map
May 5th, 2009 / No Comments » / by Chris
Skip lists are a variant of regular linked links that can allow for O(log n) search time. According to nist.gov’s Dictionary of Algorithms and Data Structures entry for Skip Lists, they are:
A randomized variant of an ordered linked list with additional, parallel lists. Parallel lists at higher levels skip geometrically more items. Searching begins at the highest level, to quickly get to the right part of the list, then uses progressively lower level lists. A new item is added by randomly selecting a level, then inserting it in order in the lists for that and all lower levels. With enough levels, searching is O(log n).
At a basic level, searching a skip list consists of starting at element i (where i = 0 to start) in level j in the list (where j is the highest level on the first iteration) and searching for your item. If you encouter a value that is greater than the value you are looking for, you move back to element i-1 in level j-1 and continue searching level j - 1. You continue this pattern until you find your element.
This process allows you to skip sections of the list that won’t contain your element (since skip lists are always sorted). Here’s a visual example:
1---------------5---------------*
1-------3-------5-------7-------*
1---2---3---4---5---6---7---8---*
Given this example, if you were searching for 8 you would only evaluate 4 elements, rather than 8 in a normal linked list. The sequence of evalution would go as follows:
1 -> 5 -> 7 -> 8
I won’t go into a lot of detail on the implementation of a skip list, but I’ll give a quick rundown of the basic and provide a couple links I found helpful. Read more…
Posted in: Data and Algorithms
Tags: algorithms, lists, ruby, skip list
May 3rd, 2009 / No Comments » / by Chris
Bloom filters are a mechanism for “indexing” a set and quickly determining whether or not a given element is part of the set that the Bloom Filter represents. You do not actually store items in a bloom filter, the bloom filter itself is simply an array of bits. The (very) basic idea is that for every element i in the set, you generate a series of array indexes (usually via some hashing method), and then set those positions in your bit array to 1.
For example, for the input “foo” you could generate the indices 1, 4 and 6. Then in your (fixed size) array, you set positions 1, 4 and 6 to “1″. For the input “bar” you generate the indices 1, 4, and 8, and set those positions to “1″. When you want to find out if an input is in the set, you generate the index values (via your hashing method) and check if those positions are set to 1 in the array. If all of them are, the input was likely in the set. If one or more of the indices were not set to 1, then the input was definitely not in the set. You can vary the number of index positions your hashing function provides and the size of your bit-array in order to reduce the number of false positives.
After your bit-array is built, representing your set, it becomes a constant time operation to check if something is in the set. Insertion of new items into the Bloom Filter is also a constant time operation. One major downside to Bloom Filters that I can see is that there isn’t a way to remove something from the bloom filter without causing false negatives.
The two articles I found useful in learning about Bloom Filters were:
Posted in: Learning, Programming Problems
Tags: bloom filter, data structures
May 1st, 2009 / No Comments » / by Chris
In learning how to use LaTeX for showing math formulas, I have found the online LaTeX Previewer extremely useful. You type in LaTeX, hit a button, and it shows you what it looks like.
Posted in: Misc
Tags: latex, Math