Archive

Archive for the ‘Computer Science’ Category

Stanford Free Classes

January 10, 2012 Leave a comment

I recently read an article by a Stanford student, Ben Rudolph, in a Machine Learning class that uses a “flipped classroom” model: the students watch lectures at home and then go into class to talk to the professor about the homework. This is an interesting model that’s been gaining some attention lately, and I have some experience with it. I mostly agree with what the article says, and I have a few additional points.

Rudolph complains that the online questions are too easy. As reported in another article,

Mr. Rudolph took particular exception to the programming exercises, in which the computer automatically informed students whether or not they got 100 percent on the task. “It’s so black and white,” he tells Wired Campus. “They have to make it easy enough so everyone can get 100 percent, basically. In the past I’ve turned in programming assignments, and only the really smart kids got stellar scores, because they went above and beyond. This model kind of discourages that.”

Those are some poorly constructed programming assignments. A computer can still grade a difficult programming assignment, because a computer program can—by definition—run on a computer, and the computer can check if it gives the correct output. For example, TopCoder offers a series of programming challenges that could take anywhere from five minutes to a few hours, and all of them are graded by computer. The computer grades not only on accuracy, but also on speed, memory efficiency, and code concision.

The main problem with this is that if you’re stuck, there’s not much you can do by yourself to figure out the next step. I suggest that students work on sophisticated problems like those at TopCoder, and the students who are struggling can talk to a professor about how to get the program a little closer to where it needs to be.

The Interpreter, Part 1 Conclusion

August 21, 2010 4 comments

The final chapter in one man’s first journey to write an interpreter.

Jackson spent the next month or so adding smaller features and fixing bugs. At last, The Interpreter Version 1.0 was ready for release. He named his language Simfpl: Simple Interpreted Mathematically-Oriented Functional Programming Language. He put the source code on the internet for everyone to see.

To be able to use it, you first need to install GMP and MPFR.

EDIT 7/1/13: I’ve moved the code to GitHub.

Categories: The Interpreter

The Interpreter, Chapter 7

July 20, 2010 1 comment

The continuing story of one man’s quest to write an interpreter.

Jackson’s interpreter was coming along smoothly. But for it to have a complete foundation, he still needed to add one final feature: user-defined functions.

Jackson wanted to keep the language definition simple. This would avoid special syntax, making the interpreter easier to write, and also would make the language easier to understand and even more extensible. If statements and while loops were defined not with special syntax, but as functions. Similarly, Jackson wanted function definitions to themselves be functions.

So he set up a “def” function, which would be a function to define other functions. It would take three arguments: the function name, the variable list, and the function definition. So far, so good. He implemented this pretty quickly.

But there was a problem: it was impossible to create a recursive function. If the programmer created a function (f) and tried to call it within its own body, the expression would not compile correctly. For (def) to act like a function it had to work at run-time, but the compiler needed to know that a reference to (f) was a function.

After brooding over this problem for some time, Jackson came up with a solution. He created a new data type called a function shell, designed to hold just enough information for the compiler to know what to do. Whenever the compiler saw “def”, it would look for the function name. Then it would find all other references to the function name and convert them into function shells. The program would compile knowing that (f) was a function, and would actually define the function after the (def) function was called at runtime.

All of the foundational features had been implemented. But there was still work to do. Jackson had to implement a few smaller features, fix the myriads of bugs that had sprung up, and optimize the code.

Stay tuned for the exciting continuation of The Interpreter!

Categories: The Interpreter

The Interpreter, Chapter 6

The continuing story of one man’s quest to write an interpreter.

Last time, our hero greatly improved the concept and implementation of the interpreter. This time, he took that even further. Soon, if statements and while loops were working. Previously, control structures had been managed by a single 500-line if statement. Not a pretty sight. The new and improved interpreter treated control structures as functions — even the statement separator (which in this case was a semicolon).

What about the interpreter’s standard output? The interpreter was printing a lot of stuff on the screen, so this was important. But the standard output was rather lacking in quality. If you tried to print out more than 1 MB of text, the program would crash.

Jackson spent over eight hours revising the output to be cleaner and catch more errors. The most useful development was a formatted output function, which caught virtually every error possible. All the other output functions could lean on this one core function. The interpreter expanded from having about three output functions to about ten — but each function was cleaner, simpler, and more versatile.

Almost everything from the old interpreter had been re-implemented, and was working better than ever. It was time to move forward. Jackson now had to implement a critical piece, something no language can do without: functions.

To be continued. . . .

Categories: The Interpreter

Human Data Structures

July 1, 2010 1 comment

In computer science, we frequently have to deal with how to represent our data. Different data structures provide different benefits. It is interesting to consider what data structures we humans are good at.

Probably the most common data structure is an array. It provides random access to any element, but inserting a new element is slow.

Humans are very bad at arrays. Unless the array is very short, we cannot achieve constant-time random access the way a computer can. To access the Nth element, if N is much greater than four or five, we have to simply count up to it.

Humans are much better at linked lists. Like a linked list, elements further down the list are harder to find. We also have a good intuitive understanding of linked lists. While an array is like a grocery list, a linked list is like a deck of cards.

What about associative arrays? The fastest computer implementation of an associative array is a hash table. But, because hash tables are usually implemented using arrays, humans are of course not very good at them. Also, being much slower at math than computers, we would have trouble computing hash functions.

If we want to use an associative array, the best bet for us humans is to implement it with a binary tree. This is how we use dictionaries. There is an alphabetized list of keys (words), each with a corresponding value (definition). We don’t immediately know the location of each word. Instead, we have to do something like a binary search. Let’s say you’re looking up the word “pear”. First you turn to about two-thirds of the way through the dictionary, where you think the P’s will be. Then you turn a few dozen pages forward or backward until you get to the P’s. Then look for ‘Pe’, then ‘Pea’, then ‘Pear’. This is something like a binary search algorithm.

The main advantage that computers have over humans is their ability to quickly access any point in memory. We as humans have to search through the whole list. Perhaps this is why it’s a good idea to use computers to hold our data.

Categories: Computer Science

The Interpreter, Chapter 5

June 17, 2010 2 comments

The continuing story of one man’s quest to write an interpreter.

Jackson’s interpreter was working great. Most of the features he wanted were working. There were still a few annoying bugs that needed to be worked out, but all in all, things were going very well.

The way that the interpreter was changing scope was painfully slow. Jackson did some research to try to find a solution, but decided that the only way to speed up the program would be to make some fundamental changes. One thing led to another, and before he knew it Jackson was up until midnight creating plans for a whole new evaluation system.

Up to this point, the interpreter’s interpretation system was pretty jumbled. Compiler and interpreter were hopelessly mixed, with no possibility of separation. Some parts were being evaluated before optimization, and some parts after. It just didn’t make much sense, and was hard to work with. To make any serious headway, things had to change.

The most important alteration would be to separate compile time and run time. The whole internal representation of functions would have to change as well. Blocks, instead of being unoptimized chunks of code, would become fully optimized expressions.

Of course, big changes also means big mistakes. Jackson spent hours tracking down all the new bugs that had been created.

Conceptually, the new interpreter was far superior to the old one. It was simpler, easier to understand, theoretically faster and just a whole lot neater overall. The real trick was making the code look like the concept.

Chapter 6

Categories: The Interpreter

The Interpreter, Chapter 4

June 3, 2010 1 comment

The continuing story of one man’s quest to write an interpreter.

Most of the basic data types — numbers, strings, arrays, lists, hashes — were supported. Next, the interpreter needed control structures. But before those could be added, they needed some sort of support. So Jackson created blocks.

Blocks began as chunks of code in between curly brackets. They were pretty simple. Getting them to work was a bit tricky, but nothing too difficult. Even before control structures were implemented, blocks still proved useful. They could be packaged up, and the same code could be called multiple times.

Jackson used this to implement if statements and while loops. To keep control structures as pure as possible, he treated if statements and while loops as functions. While loops became functions that took two blocks, one for the condition and one for the body. If statements became functions that took either two or three blocks, depending on whether there was an “else” block. After blocks were implemented, if statements and while loops were easy.

Now that blocks were possible, they could be used not just for control structures but also could be passed as arguments into functions. The first such function was the times() function, which took two arguments: a number and a block. It would execute the block a number of times equal to the given number, and it looked like 10 times { do stuff }.

For blocks to really achieve their full potential, they needed to be able to take variables. Implementing this required making the internal data structure a good deal more complicated. But when it was finished, the interpreter was all the more powerful.

At last, Jackson’s programming language was Turing complete. He successfully implemented a program to calculate every prime number from 1 to 100.

Chapter 5

Categories: The Interpreter