The Larrabee instruction set has been revealed and Michael Abrash gives a first look:
[...] MMX, SSE and Altivec. They supported vector arithmetic, but could only read and write data from contiguous locations in memory, rather than random-access as Larrabee. So SSE was only useful for operations on data that was naturally vector-like: RGBA colors, XYZW coordinates in 3D graphics, and so on. The Larrabee instructions are suitable for vectorizing any code meeting the conditions above, even when the code was not written to operate on vector-like quantities. It can benefit every type of application!
Serving HTTP with continuations, or how to make Twisted and Stackless Python play nice
In my neverending quest for The One True Way To Serve Web Applications I've recently become interested in the concept of continuations and how they can make single-process, asynchronous programming as easy and intuitive as simple blocking calls. A recent example is NeverBlock, built on Ruby 1.9 fibers.
Getting really close: plain Python >=2.5 and Twisted
Already having some experience with Twisted, I decided to try to (ab)use Python 2.5 generators to implement continuations. The basic idea was to wrap the request logic inside a generator and have it yield the Twisted deferred whenever it performed an I/O operation (all the boilerplate Twisted code has been left out):
def render_GET(self, request):
    self.continuation = self.handle(request)
    # self.continuation now contains the generator object
    # advance it to at least the first yield
    df = self.continuation.next()
    # resume the generator on the df callback
    df.addCallback(self.resumer)
    return server.NOT_DONE_YET

def handle(self, request):
    result = yield dbPool.runQuery('SELECT * FROM test')
    request.write(result)

def resumer(self, result):
    self.continuation.send(result)
This worked fine. It was also useless for any practical purpose, since generators only work from inside a single call frame, so it was impossible to add any complex code to the handle() method. Any and all request processing that potentially performed an I/O operation (and thus yielded a deferred) would have needed to reside directly in the body of the method.
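The single-frame limitation can be seen without Twisted at all. What follows is a minimal sketch in plain Python 3 (where next(gen) replaces gen.next()). FakeDeferred, fake_query, drive and both handlers are hypothetical stand-ins made up for this illustration, not Twisted APIs.

```python
started = []  # records every "query" that was actually started


class FakeDeferred:
    """Hypothetical stand-in for a Twisted Deferred: stores callbacks and fires them."""

    def __init__(self):
        self.callbacks = []

    def addCallback(self, fn):
        self.callbacks.append(fn)

    def callback(self, result):
        for fn in self.callbacks:
            fn(result)


def fake_query(sql):
    # Pretend to start an asynchronous query; the caller fires the deferred later.
    started.append(sql)
    return FakeDeferred()


def drive(gen, output):
    """Advance a generator to its first yield and resume it when the deferred fires."""
    df = next(gen)

    def resumer(result):
        try:
            gen.send(result)
        except StopIteration:
            output.append('done')

    df.addCallback(resumer)
    return df


def flat_handler(output):
    # Works: the yield sits directly in the handler's own frame.
    rows = yield fake_query('SELECT * FROM test')
    output.append(rows)


def helper():
    # The yield turns helper() itself into a generator function.
    rows = yield fake_query('SELECT * FROM test')
    return rows


flat_out = []
drive(flat_handler(flat_out), flat_out).callback(['row1', 'row2'])

queries_before = len(started)
leftover = helper()  # nothing runs here: we only get a new generator object back
queries_after = len(started)
```

Calling helper() from ordinary code neither starts the query nor suspends the caller. It only returns a generator object that somebody would still have to drive, which is why every yield had to live directly in handle().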
This is where anybody else would just use threads and call it a day. Not me, of course. I wanted to follow the Twisted philosophy of never blocking and handling thousands of requests in a single process/thread, but deferreds force the programmer to use callbacks and chop the logic up into many methods.
Enter Stackless Python
Stackless Python is a partial reimplementation of Python that adds a continuation library to the language. Its principle is very simple: you can have an arbitrary number of very lightweight, in-process, cooperative threads, called tasklets. In the main tasklet (the one your app starts on) you call stackless.run() as the main loop of your app, and inside each tasklet you call stackless.schedule() when you want to give up the CPU to the runloop. A simple inter-tasklet communication primitive, stackless.channel, rounds out the library.
My idea then was to create a new tasklet for every request Twisted received and have each one schedule() itself away whenever it found itself waiting on the completion of a deferred. The deferred callback would restart the tasklet, somehow. It turns out there was already an entire collection of examples combining Twisted and Stackless on Google Code.
In particular the TwistedWebserverThreaded.py example was basically what I wanted, but it highlighted a problem: both Stackless and Twisted make use of a runloop, and both of them assume they are the runloop. The example made use of a separate thread to run the Twisted runloop. I decided there was a better way, if giving up on a Stackless feature was an option.
Giving up on preemption
Stackless supports preemption by allowing its runloop to be called with a limit on the number of bytecode instructions to execute. This allows it to reschedule a different tasklet, but it also introduces all the annoying problems associated with preemptive code without any of the benefits (no true multicore execution). I was more than happy to give up on it, since all I wanted was continuations.
And so I had an idea: make the Twisted runloop run for as long as possible, but make sure it always calls schedule() on the completion of any I/O. Since all I asked from Stackless was multiplexing I/O in a single process, and not threading emulation, this was perfectly fine for me.
Example
I've implemented an example of how to integrate Twisted and Stackless in an HTTP server. Every request spawns a tasklet, which waits on a channel whenever it has to wait for the completion of a deferred. The callback of the deferred signals that channel and sends the deferred result through it. Since the callback runs in the context of the Twisted runloop tasklet, it gives us the opportunity to call schedule() inside it, thus giving the request tasklets the chance to run again.
The example presents a nonsensical request processor that runs three asynchronous SQL queries and, from time to time, one HTTP request and some very simple inter-tasklet communication in the form of a very lousy chat.
Base controller
class ResumableController():
    def tasklet(self):
        self.return_channel = NWChannel()
        self.me = stackless.getcurrent()
        self.stamp = random.randint(0, 1000000000)
        self.handle()
The chat receiver is just a blocking call into a Stackless channel.
    # chat methods
    def waitForChat(self):
        return chatChannel.receive()
The database uses the standard Twisted Enterprise asynchronous database pool.
    # db methods
    def waitForSQL(self, sql):
        d = dbPool.runQuery(sql)
        return self.waitForDeferred(d, self.succesfulSQLDeferred)

    def succesfulSQLDeferred(self, result):
        r = pickle.dumps(result)
        self.return_channel.send_nowait(r)
        self.reschedule()
The HTTP requester uses the Twisted HTTP client.
    # http client methods
    def waitForHTTPClient(self, url):
        d = client.getPage(url)
        return self.waitForDeferred(d, self.succesfulHTTPDeferred)

    def succesfulHTTPDeferred(self, result):
        r = html.PRE(result)
        self.return_channel.send_nowait(r)
        self.reschedule()
Common methods, including the common handling for deferreds.
    def waitForDeferred(self, d, success):
        d.addCallback(success)
        d.addErrback(self.errorDeferred)
        return self.waitForChannel()

    def waitForChannel(self):
        return self.return_channel.receive()

    def errorDeferred(self, fault):
        self.return_channel.send_exception_nowait(fault.type, fault.value)
        self.reschedule()

    def reschedule(self):
        if stackless.getcurrent() != self.me:
            stackless.schedule()
Twisted resource
The reactor.callLater(0.0, stackless.schedule) call plays the same role as the deferred callbacks giving up control to the Stackless runloop. It just does so in a different way, since we need to return from the render_GET method first.
class ClientRequestHandler(resource.Resource):
    isLeaf = True

    def __init__(self):
        resource.Resource.__init__(self)

    def render_GET(self, request):
        request.write('request arrives<br>')
        c = ExampleController(request)
        stackless.tasklet(c.tasklet)()
        request.write('still in the reactor tasklet<br>')
        reactor.callLater(0.0, stackless.schedule)
        return server.NOT_DONE_YET
Example controller
handle is where all the fun is. All the wait* methods block, but they also immediately give up control to the Stackless runloop, which ultimately gives control to the Twisted runloop. When a deferred calls back, control passes again from Twisted to Stackless, which wakes up the request tasklet. The result is a working continuation-based HTTP server in Python, with full I/O-based scheduling!
class ExampleController(ResumableController):
    def __init__(self, request):
        self.request = request

    def handle(self):
        self.request.write('hi, we are now inside the request tasklet<br>')
        # replace this query with something valid for your DB
        sql = 'select * from test limit 10'
        self.request.write('<br><br>QUERY 1:')
        self.request.write(self.waitForSQL(sql))
        if (random.randint(0, 9) <= 1):
            self.request.write('<br><br>HTTP:')
            self.request.write(self.waitForHTTPClient('http://www.google.com/'))
        self.request.write('<br><br>QUERY 2:')
        self.request.write(self.waitForSQL(sql))
        if (random.randint(0, 9) <= 1):
            self.request.write('<br><br>CHAT PRODUCER:')
            self.request.write('sending to {0:d} clients'.format(chatChannel.balance))
            chatChannel.send('hi from {0:d} (and to other {1:d} chaps)'.format(self.stamp, chatChannel.balance))
        if (random.randint(0, 9) <= 2):
            self.request.write('<br><br>CHAT CONSUMER:')
            self.request.write(self.waitForChat())
        self.request.write('<br><br>QUERY 3:')
        self.request.write(self.waitForSQL(sql))
        self.request.finish()
Download
You can download the full example from here: server-twisted-stackless.zip. If you don't have a database on hand just comment out the pool creation and replace the database calls with Google.com fetches for example. If some requests appear to hang it just means they are waiting for chat messages from other requests. This is expected since it tries to emulate the behavior of Comet applications. Comment out the waitForChat call if it annoys you.
Patrick Thomson draws a parallel between Haskell monads and jQuery method chaining. The result is the easiest introduction to the concept of monads I've ever read.
This week at CES Palm finally announced its new phone, the Pre, powered by its new Linux-based OS, webOS (Spanish speakers will have a laugh with this name), which is based on an HTML/JS stack for application development. Ars Technica publishes two articles with in-depth information on the Pre's hardware (it's nice to finally see mainstream OMAP3/Cortex A8 devices) and its app store and SDK (links via OSNews).
Personally I have my doubts about HTML/JS based development. In performance terms it won't be able to compete with the iPhone, which has fully native application support (the APIs may be crippled, but that's a different problem). With the Dalvik VM in Android apparently being extremely slow it looks like only the iPhone, Symbian and WinMo platforms are willing to share the full power of the hardware with third party developers.
Excellent article on a modular grid layout by Jason Santa Maria. Grid layouts are inspired by the same rules that have been in use by print design for decades. The example (with grid) shows the extreme simplicity of the CSS framework compared to the rich visual results. Jason's homepage is also a stunning example of a grid layout.
The web site for An Event Apart was recently redesigned by Jeffrey Zeldman, with programming by Eric Meyer. In his post Eric explains how wanting to have hrefs on any element made him choose HTML 5, and the gotchas and problems he found with current browsers.
A very interesting introduction to Haskell for programmers used to imperative languages: Haskell for C Programmers.
Ilya Grigorik has posted a very nice introduction to bloom filters in Ruby:
Instead of storing the key-value pairs, as a regular hash table would, a Bloom filter will give you only one piece of information: true or false based on the presence of a key in the hash table (equivalent to Enumerable's include?() function in Ruby). This relaxation allows the filter to be represented with a much smaller piece of memory: instead of storing each value, a bloom filter is simply an array of bits indicating the presence of that key in the filter.
(Not to be confused with the pretty kind of bloom filters)
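To make the "array of bits" idea concrete, here is a minimal sketch in Python rather than Ruby. It is not Ilya's implementation: the class name, the default sizes and the use of salted SHA-256 digests as hash functions are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: a bit array plus several hash positions per key."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)  # all bits start at zero

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            data = '{0}:{1}'.format(i, key).encode('utf-8')
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], 'big') % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly present", False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


seen = BloomFilter()
seen.add('ruby')
has_ruby = 'ruby' in seen
empty_has_python = 'python' in BloomFilter()
```

The whole filter above fits in 128 bytes no matter how many keys are added. The price is that a lookup can return a false positive once enough bits are set, while a key that was added is never reported as missing.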
Git, which started as a loose collection of shell scripts, has come a long way and is being adopted by many open source projects. Why Git is Better Than X offers a comparison between Git and other popular SCM systems.
Google is going to sell an unlocked G1-like phone for Android developers, without ROM signing and without SIM locking. It's not clear if the bundled ROM image will include all the applications of the T-Mobile G1, so it may not be usable as a normal phone, but this is a step in the right direction.
MySQL 5.0 has been without community releases for 4 months, and now Sun ships a "GA" version of MySQL 5.1 that, according to the original MySQL founder, is full of crashing bugs. It may be time to seriously consider PostgreSQL and to keep an eye on Drizzle.
C. Enrique Ortiz reports on the recent public request for comments call from the MSA 2 working group. MSA 2 handsets and applications won't be anywhere near Android in terms of capabilities and above all platform integration, but it is a step in the right direction.
Will Larson writes about deploying Django with Fabric. Fabric is a Capistrano-like tool for remote deployment written in Python.
Gmail for mobile 2.0 is out. I've been an avid user of the 1.x version for years and here is my quick review:
The Good:
- Multiple account support, including Gmail for Domains support.
- Limited offline mode.
The Bad:
- The "Refresh" softkey is gone, replaced by a "Hide" one. "Refresh" is now a menu option and it apparently has no key shortcut. Thanks but no thanks. Most multitasking phones nowadays feature a dedicated "Home" or "Menu" key of some sort that hides the application. It would be fine to hide "Refresh" away in the menu if the application had push synchronization, but that's not the case (not that it could anyway: long-lived push on J2ME is impossible on almost every phone). That's why "Refresh" should be on an easy, visible shortcut and not in the menu.
I'm keeping this version for the multiple account support but I hope Google brings back the "Refresh" softkey or at least maps a key shortcut to it. Right now "Hide" is completely useless on my Nokia E61.
Android has finally been made fully open source. Previously only a small part of Android was open source and, for example, it was impossible to build it for a new architecture.
Opera has announced MAMA, a new search engine that indexes the actual markup and styling information of a web page:
MAMA is a structural Web-page search engine—it trawls Web pages and returns results detailing page structures, including what HTML, CSS, and script is used on it, as well as whether the HTML validates.
The PyPy team recently held a coding sprint, and its status blog has been updated with notes on the JIT generator and C++ bindings.
Dave Herman explores the strange interactions between eval() and variable scoping in Javascript.
Glenn Gillen has a nice writeup on the new features presented at MerbCamp. And here is day 2.
Excellent overview on the cruel dynamics of the iPhone App Store:
Think of it as if there was a single Top 40 music radio station everybody listened to: moving up or down in the list has a huge impact on sales, and dropping from the list means your sales will be easily reduced by one or two orders of magnitude.
Ilya Grigorik implements the stale-while-revalidate Cache-Control proposal with EventMachine and memcached:
The application logic is simple: if we have never seen this request, process, and cache it; if we've seen this request, and the cache is valid, then render response; if we've seen this response, but the cache is stale, render the stale version immediately, and then continue the process to update the cache. Also, to avoid the 'stampeding' effect, we've added a flag to mark a request as in-progress, to indicate that an application server is working on updating the cache.
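The decision logic in that quote can be sketched in a few lines of Python. This is only an in-process illustration, not Ilya's code: a dict stands in for memcached, an explicit run_pending() call stands in for the work EventMachine would do after the response is sent, and every name below is hypothetical.

```python
import time


class StaleWhileRevalidateCache:
    """Illustrative cache: serve stale entries at once and refresh them afterwards."""

    def __init__(self, ttl, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}         # key -> (value, stored_at); stand-in for memcached
        self.in_progress = set()  # keys with a refresh queued, avoids stampedes
        self.pending = []         # refreshes to run after the response is served

    def get(self, key, compute):
        now = self.clock()
        if key not in self.entries:
            # Never seen: process the request and cache the result.
            value = compute()
            self.entries[key] = (value, now)
            return value
        value, stored_at = self.entries[key]
        if now - stored_at <= self.ttl:
            return value  # cache is still valid
        if key not in self.in_progress:
            # Stale: queue a single refresh, however many requests arrive.
            self.in_progress.add(key)
            self.pending.append((key, compute))
        return value  # render the stale version immediately

    def run_pending(self):
        while self.pending:
            key, compute = self.pending.pop(0)
            try:
                self.entries[key] = (compute(), self.clock())
            finally:
                self.in_progress.discard(key)


now = [0.0]  # fake clock so the example does not need to sleep
calls = []


def render():
    calls.append(now[0])
    return 'page v{0}'.format(len(calls))


cache = StaleWhileRevalidateCache(ttl=10, clock=lambda: now[0])
first = cache.get('/', render)    # miss: rendered and cached
now[0] = 5.0
second = cache.get('/', render)   # fresh hit
now[0] = 20.0
third = cache.get('/', render)    # stale: served as-is, refresh queued
fourth = cache.get('/', render)   # stale again, no second refresh queued
cache.run_pending()
fifth = cache.get('/', render)    # refreshed value
```

In a real deployment the in-progress flag would have to live in the shared cache as well, so that several application servers can see it.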
Big list of Django tips. Note: if your retinas start melting while reading the list try a zap colors bookmarklet.
