<html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Introduction to Foolscap</title> <style src="stylesheet-unprocessed.css"></style> </head> <body> <h1>Introduction to Foolscap</h1> <h2>Introduction</h2> <p>Suppose you find yourself in control of both ends of the wire: you have two programs that need to talk to each other, and you get to use any protocol you want. If you can think of your problem in terms of objects that need to make method calls on each other, then chances are good that you can use the Foolscap protocol rather than trying to shoehorn your needs into something like HTTP, or implementing yet another RPC mechanism.</p> <p>Foolscap is based upon a few central concepts:</p> <ul> <li><em>serialization</em>: taking fairly arbitrary objects and types, turning them into a chunk of bytes, sending them over a wire, then reconstituting them on the other end. By keeping careful track of object ids, the serialized objects can contain references to other objects and the remote copy will still be useful. </li> <li><em>remote method calls</em>: doing something to a local proxy and causing a method to get run on a distant object. The local proxy is called a <code class="API" base="foolscap.referenceable">RemoteReference</code>, and you <q>do something</q> by running its <code>.callRemote</code> method. The distant object is called a <code class="API" base="foolscap.referenceable">Referenceable</code>, and it has methods like <code>remote_foo</code> that will be invoked.</li> </ul> <p>Foolscap is the descendant of Perspective Broker (which lived in the twisted.spread package). For many years it was known as "newpb". A lot of the API still has the name "PB" in it somewhere. These will probably go away sooner or later.</p> <p>A "foolscap" is a size of paper, probably measuring 17 by 13.5 inches. A twisted foolscap of paper makes a good fool's cap. 
Also, "cap" makes me think of capabilities, and Foolscap is a protocol to implement a distributed object-capabilities model in python.</p> <h2>Getting Started</h2> <p>Any Foolscap application has at least two sides: one which hosts a remotely-callable object, and another which calls (remotely) the methods of that object. We'll start with a simple example that demonstrates both ends. Later, we'll add more features like RemoteInterface declarations, and transferring object references.</p> <p>The most common way to make an object with remotely-callable methods is to subclass <code class="API" base="foolscap.referenceable">Referenceable</code>. Let's create a simple server which does basic arithmetic. You might use such a service to perform difficult mathematical operations, like addition, on a remote machine which is faster and more capable than your own<span class="footnote">although really, if your client machine is too slow to perform this kind of math, it is probably too slow to run python or use a network, so you should seriously consider a hardware upgrade</span>.</p>
<pre class="python">
from foolscap.api import Referenceable

class MathServer(Referenceable):
    def remote_add(self, a, b):
        return a+b
    def remote_subtract(self, a, b):
        return a-b
    def remote_sum(self, args):
        total = 0
        for a in args:
            total += a
        return total

myserver = MathServer()
</pre>
<p>On the other end of the wire (which you might call the <q>client</q> side), the code will have a <code class="API" base="foolscap.referenceable">RemoteReference</code> to this object. The <code>RemoteReference</code> has a method named <code class="API" base="foolscap.referenceable.RemoteReference">callRemote</code> which you will use to invoke the method. It always returns a Deferred, which will fire with the result of the method. 
Assuming you've already acquired the <code>RemoteReference</code>, you would invoke the method like this:</p>
<pre class="python">
def gotAnswer(result):
    print "result is", result
def gotError(err):
    print "error:", err
d = remote.callRemote("add", 1, 2)
d.addCallbacks(gotAnswer, gotError)
</pre>
<p>OK, now how do you acquire that <code>RemoteReference</code>? How do you make the <code>Referenceable</code> available to the outside world? For this, we'll need to discuss the <q>Tub</q>, and the concept of a <q>FURL</q>.</p> <h2>Tubs: The Foolscap Service</h2> <p>The <code class="API" base="foolscap.pb">Tub</code> is the container that you use to publish <code>Referenceable</code>s, and is the middle-man you use to access <code>Referenceable</code>s on other systems. It is known as the <q>Tub</q>, since it provides naming and identification properties similar to those of the <a href="http://www.erights.org/">E language</a>'s <q>Vat</q><span class="footnote">but they do not provide quite the same insulation against other objects as E's Vats do. In this sense, Tubs are leaky Vats.</span>. If you want to make a <code>Referenceable</code> available to the world, you create a Tub, tell it to listen on a TCP port, and then register the <code>Referenceable</code> with it under a name of your choosing. If you want to access a remote <code>Referenceable</code>, you create a Tub and ask it to acquire a <code>RemoteReference</code> using that same name.</p> <p>The <code>Tub</code> is a Twisted <code class="API" base="twisted.application.service">Service</code> subclass, so you use it in the same way: once you've created one, you attach it to a parent Service or Application object. Once the top-level Application object has been started, the Tub will start listening on any network ports you've requested. When the Tub is shut down, it will stop listening and drop any connections it had established since last startup. 
If you have no parent to attach it to, you can use <code>startService</code> and <code>stopService</code> on the Tub directly.</p> <p>Note that no network activity will occur until the Tub's <code>startService</code> method has been called. This means that any <code>getReference</code> or <code>connectTo</code> requests that occur before the Tub is started will be deferred until startup. If the program forgets to start the Tub, these requests will never be serviced. A message to this effect is added to the twistd.log file to help developers discover this kind of problem.</p> <h3>Making your Tub remotely accessible</h3> <p>To make any of your <code>Referenceable</code>s available, you must make your Tub available. There are three parts: give it an identity, have it listen on a port, and tell it the protocol/hostname/portnumber at which that port is accessible to the outside world.</p> <p>In general, the Tub will generate its own identity, the <em>TubID</em>, by creating an SSL public key certificate and hashing it into a suitably-long random-looking string. This is the primary identifier of the Tub: everything else is just a <em>location hint</em> that suggests how the Tub might be reached. The fact that the TubID is tied to the public key allows FURLs to be <q>secure</q> references (meaning that no third party can cause you to connect to the wrong reference). You can also create a Tub with a pre-existing certificate, which is how Tubs can retain a persistent identity over multiple executions.</p> <p>You can also create an <code>UnauthenticatedTub</code>, which has an empty TubID. Hosting and connecting to unauthenticated Tubs do not require the pyOpenSSL library, but do not provide privacy, authentication, connection redirection, or shared listening ports. The FURLs that point to unauthenticated Tubs have a distinct form (starting with <code>pbu:</code> instead of <code>pb:</code>) to make sure they are not mistaken for authenticated Tubs. 
Foolscap uses authenticated Tubs by default.</p> <p>Having the Tub listen on a TCP port is as simple as calling <code class="API" base="foolscap.pb.Tub">listenOn</code> with a <code class="API" base="twisted.application">strports</code>-formatted port specification string. The simplest such string would be <q>tcp:12345</q>, to listen on port 12345 on all interfaces. Using <q>tcp:12345:interface=127.0.0.1</q> would cause it to only listen on the localhost interface, making it available only to other processes on the same host. The <code>strports</code> module provides many other possibilities.</p> <p>The Tub needs to be told how it can be reached, so it knows what host and port to put into the FURLs it creates. This location is simply a string in the format <q>host:port</q>, using the host name by which that TCP port you've just opened can be reached. Foolscap cannot, in general, guess what this name is, especially if there are NAT boxes or port-forwarding devices in the way. If your machine is reachable directly over the internet as <q>myhost.example.com</q>, then you could use something like this:</p>
<pre class="python">
from foolscap.api import Tub

tub = Tub()
tub.listenOn("tcp:12345")  # start listening on TCP port 12345
tub.setLocation("myhost.example.com:12345")
</pre>
<h3>Registering the Referenceable</h3> <p>Once the Tub has a Listener and a location, you can publish your <code>Referenceable</code> to the entire world by picking a name and registering it:</p>
<pre class="python">
furl = tub.registerReference(myserver, "math-service")
</pre>
<p>This returns the <q>FURL</q> for your <code>Referenceable</code>. Remote systems will use this FURL to access your newly-published object. The registration just maps a per-Tub name to the <code>Referenceable</code>: technically the same <code>Referenceable</code> could be published multiple times, under different names, or even be published by multiple Tubs in the same application. 
But in general, each program will have exactly one Tub, and each object will be registered under only one name.</p> <p>In this example (if we pretend the generated TubID was <q>ABCD</q>), the FURL returned by <code>registerReference</code> would be <code>"pb://ABCD@myhost.example.com:12345/math-service"</code>.</p> <p>If you do not provide a name, a random (and unguessable) name will be generated for you. This is useful when you want to give access to your <code>Referenceable</code> to someone specific, but do not want to make it possible for someone else to acquire it by guessing the name.</p> <p>To use an unauthenticated Tub instead, you would do the following:</p>
<pre class="python">
from foolscap.api import UnauthenticatedTub

tub = UnauthenticatedTub()
tub.listenOn("tcp:12345")  # start listening on TCP port 12345
tub.setLocation("myhost.example.com:12345")
furl = tub.registerReference(myserver, "math-service")
</pre>
<p>In this case, the FURL would be <code>"pbu://myhost.example.com:12345/math-service"</code>. The deterministic nature of this form makes it slightly easier to throw together quick-and-dirty Foolscap applications, since you only need to hard-code the target host and port into the client side program. However, any serious application should just use the default authenticated form and use a full FURL as its starting point. Note that the FURL can come from anywhere: typed in by the user, retrieved from a web page, or hardcoded into the application.</p> <h4>Using a persistent certificate</h4> <p>The Tub uses a TLS public-key certificate as the base of all its cryptographic operations. If you don't give it one when you create the Tub, it will generate a brand-new one.</p> <p>The TubID is simply the hash of this certificate, so if you are writing an application that should have a stable long-term identity, you will need to ensure that the Tub uses the same certificate every time your app starts. 
The easiest way to do this is to pass the <code>certFile=</code> argument into your <code>Tub()</code> constructor call. This argument provides a filename where you want the Tub to store its certificate. The first time the Tub is started (when this file does not exist), the Tub will generate a new certificate and store it here. On subsequent invocations, the Tub will read the earlier certificate from this location. Make sure this filename points to a writable location, and that you pass the same filename to <code>Tub()</code> each time.</p> <h4>Using a Persistent FURL</h4> <p>It is often useful to ensure that a given Referenceable's FURL is both unguessable and stable, remaining the same from one invocation of the program that hosts it to the next. One (bad) way to do this is to have the programmer choose an unguessable name, embed it in the program, and pass it into <code>registerReference</code> each time the program runs, but of course this means that the name will be visible to anyone who sees the source code for the program, and the same name will be used by all copies of the program everywhere.</p> <p>A better approach is to use the <code>furlFile=</code> argument. This argument provides a filename that is used to hold the stable FURL for this object. If the furlfile exists when <code>registerReference</code> is called, the Tub will use the name inside it when constructing the new FURL. If it doesn't exist, it will create a new (unguessable) name. The new FURL will always be written into the furlfile afterwards. In addition, the tubid in the old FURL will be checked against the current Tub's tubid to make sure it matches. 
(This means that if you use furlFile=, you should also use the certFile= argument when constructing the Tub.)</p> <h3>Retrieving a RemoteReference</h3> <p>On the <q>client</q> side, you also need to create a Tub, although you don't need to perform the (<code>listenOn</code>, <code>setLocation</code>, <code>registerReference</code>) sequence unless you are also publishing <code>Referenceable</code>s to the world. To acquire a reference to somebody else's object, just use <code class="API" base="foolscap.pb.Tub">getReference</code>:</p>
<pre class="python">
from foolscap.api import Tub

tub = Tub()
tub.startService()
d = tub.getReference("pb://ABCD@myhost.example.com:12345/math-service")
def gotReference(remote):
    print "Got the RemoteReference:", remote
def gotError(err):
    print "error:", err
d.addCallbacks(gotReference, gotError)
</pre>
<p><code>getReference</code> returns a Deferred which will fire with a <code>RemoteReference</code> that is connected to the remote <code>Referenceable</code> named by the FURL. It will use an existing connection, if one is available, and it will return an existing <code>RemoteReference</code>, if one has already been acquired.</p> <p>Since <code>getReference</code> requests are queued until the Tub starts, the following will work too. But don't forget to call <code>tub.startService()</code> eventually, otherwise your program will hang forever.</p>
<pre class="python">
from foolscap.api import Tub

tub = Tub()
d = tub.getReference("pb://ABCD@myhost.example.com:12345/math-service")
def gotReference(remote):
    print "Got the RemoteReference:", remote
def gotError(err):
    print "error:", err
d.addCallbacks(gotReference, gotError)
tub.startService()
</pre>
<h3>Complete example</h3> <p>Here are two programs, one implementing the server side of our remote-addition protocol, the other behaving as a client. This first example uses an unauthenticated Tub so you don't have to manually copy a FURL from the server to the client. 
Both of these are standalone programs (you just run them), but normally you would create an <code class="API" base="twisted.application.service">Application</code> object and pass the file to <code>twistd -noy</code>. An example of that usage will be provided later.</p> <a href="listings/pb1server.py" class="py-listing" skipLines="2">pb1server.py</a> <a href="listings/pb1client.py" class="py-listing" skipLines="2">pb1client.py</a>
<pre class="shell">
% doc/listings/pb1server.py
the object is available at: pbu://localhost:12345/math-service
</pre>
<pre class="shell">
% doc/listings/pb1client.py
got a RemoteReference
asking it to add 1+2
the answer is 3
%
</pre>
<p>The second example uses authenticated Tubs. When running this example, you must copy the FURL printed by the server and provide it as an argument to the client.</p> <a href="listings/pb2server.py" class="py-listing" skipLines="2">pb2server.py</a> <a href="listings/pb2client.py" class="py-listing" skipLines="2">pb2client.py</a>
<pre class="shell">
% doc/listings/pb2server.py
the object is available at: pb://abcd123@localhost:12345/math-service
</pre>
<pre class="shell">
% doc/listings/pb2client.py pb://abcd123@localhost:12345/math-service
got a RemoteReference
asking it to add 1+2
the answer is 3
%
</pre>
<h3>FURLs</h3> <p>In Foolscap, each world-accessible Referenceable has one or more FURLs which are <q>secure</q>, where we use the capability-security definition of the term, meaning those FURLs have the following properties:</p> <ul> <li>The only way to acquire the FURL is either to get it from someone else who already has it, or to be the person who published it in the first place.</li> <li>Only that original creator of the FURL gets to determine which Referenceable it will connect to. If your <code>tub.getReference(url)</code> call succeeds, the Referenceable you will be connected to will be the right one.</li> </ul> <p>To accomplish the first goal, FURLs must be unguessable. 
You can register the reference with a human-readable name if your intention is to make it available to the world, but in general you will let <code>tub.registerReference</code> generate a random name for you, preserving the unguessability property.</p> <p>To accomplish the second goal, the cryptographically-secure TubID is used as the primary identifier, and the <q>location hints</q> are just that: hints. If DNS has been subverted to point the hostname at a different machine, or if a man-in-the-middle attack causes you to connect to the wrong box, the TubID will not match the remote end, and the connection will be dropped. These attacks can cause a denial-of-service, but they cannot cause you to mistakenly connect to the wrong target.</p> <p>Obviously this second property only holds if you use SSL. If you choose to use unauthenticated Tubs, all security properties are lost.</p> <p>The format of a FURL, like <code>pb://abcd123@example.com:5901,backup.example.com:8800/math-server</code>, is as follows<span class="footnote">note that the FURL uses the same format as an <a href="http://www.waterken.com/dev/YURL/httpsy/">HTTPSY</a> URL</span>:</p> <ol> <li>The literal string <code>pb://</code></li> <li>The TubID (as a base32-encoded hash of the SSL certificate)</li> <li>A literal <code>@</code> sign</li> <li>A comma-separated list of <q>location hints</q>. Each is one of the following: <ul> <li>TCP over IPv4 via DNS: <code>HOSTNAME:PORTNUM</code></li> <li>TCP over IPv4 without DNS: <code>A.B.C.D:PORTNUM</code></li> <li>TCP over IPv6: (TODO, maybe <code>tcp6:HOSTNAME:PORTNUM</code>?)</li> <li>TCP over IPv6 w/o DNS: (TODO, maybe <code>tcp6:[X:Y::Z]:PORTNUM</code>)</li> <li>Unix-domain socket: (TODO)</li> </ul> Each location hint is attempted in turn. 
Servers can return a <q>redirect</q>, which will cause the client to insert the provided redirect targets into the hint list and start trying them before continuing with the original list.</li> <li>A literal <code>/</code> character</li> <li>The reference's name</li> </ol> <p>(Unix-domain sockets are represented with only a single location hint, in the format <code>pb://ABCD@unix/path/to/socket/NAME</code>, but this needs some work)</p> <p>FURLs for unauthenticated Tubs, like <code>pbu://example.com:8700/math-server</code>, are formatted as follows:</p> <ol> <li>The literal string <code>pbu://</code></li> <li>A comma-separated list of location hints, as above</li> <li>A literal <code>/</code> character</li> <li>The reference's name</li> </ol> <h2>Clients vs Servers, Names and Capabilities</h2> <p>It is worthwhile to point out that Foolscap is a symmetric protocol. <code>Referenceable</code> instances can live on either side of a wire, and the only difference between <q>client</q> and <q>server</q> is who publishes the object and who initiates the network connection.</p> <p>In any Foolscap-using system, the very first object exchanged must be acquired with a <code>tub.getReference(url)</code> call<span class="footnote">in fact, the very <em>very</em> first object exchanged is a special implicit RemoteReference to the remote Tub itself, which implements an internal protocol that includes a method named <code>remote_getReference</code>. The <code>tub.getReference(url)</code> call is turned into one step that connects to the remote Tub, and a second step which invokes <code>remotetub.callRemote("getReference", refname)</code> on the result</span>, which means it must have been published with a call to <code>tub.registerReference(ref, name)</code>. After that, other objects can be passed as an argument to (or a return value from) a remotely-invoked method of that first object. 
Any suitable <code>Referenceable</code> object that is passed over the wire will appear on the other side as a corresponding <code>RemoteReference</code>. It is not necessary to <code>registerReference</code> something to let it pass over the wire.</p> <p>The converse of this property is thus: if you do <em>not</em> <code>registerReference</code> a particular <code>Referenceable</code>, and you do <em>not</em> give it to anyone else (by passing it in an argument to somebody's remote method, or returning it from one of your own), then nobody else will be able to get access to that <code>Referenceable</code>. This property means the <code>Referenceable</code> is a <q>capability</q>, as holding a corresponding <code>RemoteReference</code> gives someone a power that they cannot acquire in any other way<span class="footnote">of course, the Foolscap connections must be secured with SSL (otherwise an eavesdropper or man-in-the-middle could get access), and the registered name must be unguessable (or someone else could acquire a reference), but both of these are the default.</span>.</p> <p>In the following example, the first program creates an RPN-style <code>Calculator</code> object which responds to <q>push</q>, <q>pop</q>, <q>add</q>, and <q>subtract</q> messages from the user. The user can also register an <code>Observer</code>, to which the Calculator sends an <code>event</code> message each time something happens to the calculator's state. When you consider the <code>Calculator</code> object, the first program is the server and the second program is the client. When you think about the <code>Observer</code> object, the first program is a client and the second program is the server. It also happens that the first program is listening on a socket, while the second program initiated a network connection to the first. It <em>also</em> happens that the first program published an object under some well-known name, while the second program has not published any objects. 
These are all independent properties.</p> <p>Also note that the Calculator side of the example is implemented using an <code class="API" base="twisted.application.service">Application</code> object, which is the way you'd normally build a real-world application. You therefore use <code>twistd</code> to launch the program. The User side is written with the same <code>reactor.run()</code> style as the earlier example.</p> <p>The server registers the Calculator instance and prints the FURL at which it is listening. You need to pass this FURL to the client program so it knows how to contact the server. If you have a modern version of Twisted (2.5 or later) and the right encryption libraries installed, you'll get an authenticated Tub (for which the FURL will start with "pb:" and will be fairly long). If you don't, you'll get an unauthenticated Tub (with a relatively short FURL that starts with "pbu:").</p> <a href="listings/pb3calculator.py" class="py-listing" skipLines="2">pb3calculator.py</a> <a href="listings/pb3user.py" class="py-listing" skipLines="2">pb3user.py</a>
<pre class="shell">
% twistd -noy doc/listings/pb3calculator.py
15:46 PDT [-] Log opened.
15:46 PDT [-] twistd 2.4.0 (/usr/bin/python 2.4.4) starting up
15:46 PDT [-] reactor class: twisted.internet.selectreactor.SelectReactor
15:46 PDT [-] Loading doc/listings/pb3calculator.py...
15:46 PDT [-] the object is available at: pb://5ojw4cv4u4d5cenxxekjukrogzytnhop@localhost:12345/calculator
15:46 PDT [-] Loaded.
15:46 PDT [-] foolscap.pb.Listener starting on 12345
15:46 PDT [-] Starting factory &lt;Listener at 0x4869c0f4 on tcp:12345 with tubs None&gt;
</pre>
<pre class="shell">
% doc/listings/pb3user.py \
   pb://5ojw4cv4u4d5cenxxekjukrogzytnhop@localhost:12345/calculator
event: push(2)
event: push(3)
event: add
event: pop
the result is 5
%
</pre>
<h2>Invoking Methods, Method Arguments</h2> <p>As you've probably already guessed, all the methods with names that begin with <code>remote_</code> will be available to anyone who manages to acquire a corresponding <code>RemoteReference</code>. <code>remote_foo</code> matches a <code>ref.callRemote("foo")</code>, etc. This name lookup can be changed by overriding <code>Referenceable</code> (or, perhaps more usefully, implementing an <code class="API" base="foolscap.ipb">IRemotelyCallable</code> adapter).</p> <p>The arguments of a remote method may be passed as either positional parameters (<code>foo(1,2)</code>), or as keyword args (<code>foo(a=1,b=2)</code>), or a mixture of both. The usual Python rules about not duplicating parameters apply.</p> <p>You can pass all sorts of normal objects to a remote method: strings, numbers, tuples, lists, and dictionaries. The serialization of these objects is handled by <a href="specifications/banana.xhtml">Banana</a>, which knows how to convey arbitrary object graphs over the wire. Things like containers which contain multiple references to the same object, and recursive references (cycles in the object graph) are all handled correctly<span class="footnote">you may not want to accept shared objects in your method arguments, as it could lead to surprising behavior depending upon how you have written your method. The <code class="API" base="foolscap.schema">Shared</code> constraint will let you express this, and is described in the <a href="#constraints">Constraints</a> section of this document</span>.</p> <p>Passing instances is handled specially. 
Foolscap will not send anything over the wire that it does not know how to serialize, and (unlike the standard <code>pickle</code> module) it will not make assumptions about how to handle classes that have not been explicitly marked as serializable. This is for security, both for the sender (making sure you don't pass anything over the wire that you didn't intend to let out of your security perimeter), and for the recipient (making sure outsiders aren't allowed to create arbitrary instances inside your memory space, and therefore letting them run somewhat arbitrary code inside <em>your</em> perimeter).</p> <p>Sending <code>Referenceable</code>s is straightforward: they always appear as a corresponding <code>RemoteReference</code> on the other side. You can send the same <code>Referenceable</code> as many times as you like, and it will always show up as the same <code>RemoteReference</code> instance. A distributed reference count is maintained, so as long as the remote side hasn't forgotten about the <code>RemoteReference</code>, the original <code>Referenceable</code> will be kept alive.</p> <p>Sending <code>RemoteReference</code>s falls into two categories. If you are sending a <code>RemoteReference</code> back to the Tub that you got it from, they will see their original <code>Referenceable</code>. If you send it to some other Tub, they will (eventually) see a <code>RemoteReference</code> of their own. This last feature is called an <q>introduction</q>, and has a few additional requirements: see the <a href="#introductions">Introductions</a> section of this document for details.</p> <p>Sending instances of other classes requires that you tell Banana how they should be serialized. <code>Referenceable</code> is good for copy-by-reference semantics<span class="footnote">In fact, if all you want is referenceability (and not callability), you can use <code class="API" base="foolscap.referenceable">OnlyReferenceable</code>. 
Strictly speaking, <code>Referenceable</code> is both <q>Referenceable</q> (meaning it is sent over the wire using pass-by-reference semantics, and it survives a round trip) and <q>Callable</q> (meaning you can invoke remote methods on it). <code>Referenceable</code> should really be named <code>Callable</code>, but the existing name has a lot of historical weight behind it.</span>. For copy-by-value semantics, the easiest route is to subclass <code class="API" base="foolscap.copyable">Copyable</code>. See the <a href="#copyable">Copyable</a> section for details. Note that you can also register an <code class="API" base="foolscap.copyable">ICopyable</code> adapter on third-party classes to avoid subclassing. You will need to register the <code>Copyable</code>'s name on the receiving end too, otherwise Banana will not know how to unserialize the incoming data stream.</p> <p>When returning a value from a remote method, you can do all these things, plus two more. If you raise an exception, the caller's Deferred will have the errback fired instead of the callback, with a <code class="API" base="foolscap.call">CopiedFailure</code> instance that describes what went wrong. The <code>CopiedFailure</code> is not quite as useful as a local <code class="API" base="twisted.python.failure">Failure</code> object would be: see <a href="failures.xhtml">failures.xhtml</a> for details.</p> <p>The other alternative is for your method to return a <code class="API" base="twisted.internet.defer">Deferred</code>. If this happens, the caller will not actually get a response until you fire that Deferred. This is useful when the remote operation being requested cannot complete right away. The caller's Deferred will fire with whatever value you eventually fire your own Deferred with. 
If your Deferred is errbacked, their Deferred will be errbacked with a <code>CopiedFailure</code>.</p> <h2>Constraints and RemoteInterfaces</h2><a name="constraints" /> <p>One major feature introduced by Foolscap (relative to oldpb) is the serialization <code class="API" base="foolscap.schema">Constraint</code>. This lets you place limits on what kind of data you are willing to accept, which enables safer distributed programming. Typically python uses <q>duck typing</q>, wherein you usually just throw some arguments at the method and see what happens. When you are less sure of the origin of those arguments, you may want to be more circumspect. Enforcing type checking at the boundary between your code and the outside world may make it safer to use duck typing inside those boundaries. The type specifications also form a convenient remote API reference you can publish for prospective clients of your remotely-invokable service.</p> <p>In addition, these Constraints are enforced on each token as it arrives over the wire. This means that you can calculate a (small) upper bound on how much received data your program will store before it decides to hang up on the violator, minimizing your exposure to DoS attacks that involve sending random junk at you.</p> <p>There are three pieces you need to know about: Tokens, Constraints, and RemoteInterfaces.</p> <h3>Tokens</h3> <p>The fundamental unit of serialization is the Banana Token. These are thoroughly documented in the <a href="specifications/banana.xhtml">Banana Specification</a>, but what you need to know here is that each piece of non-container data, like a string or a number, is represented by a single token. Containers (like lists and dictionaries) are represented by a special OPEN token, followed by tokens for everything that is in the container, followed by the CLOSE token. 
Everything Banana does is in terms of these nested OPEN/stuff/stuff/CLOSE sequences of tokens.</p> <p>Each token consists of a header, a type byte, and an optional body. The header is always a base-128 number with a maximum of 64 digits, and the type byte is always a single byte. The length of the body (if present) is indicated by the number encoded in the header.</p> <p>The length-first token format means that the receiving system never has to accept more than 65 bytes before it knows the type and size of the token, at which point it can make a decision about accepting or rejecting the rest of it.</p> <h3>Constraints</h3> <p>The <code>foolscap.schema</code> module has a variety of <code class="API" base="foolscap.schema">Constraint</code> classes that can be applied to incoming data. Most of them correspond to typical Python types, e.g. <code class="API" base="foolscap.schema">ListOf</code> matches a list, with a certain maximum length, and a child <code>Constraint</code> that gets applied to the contents of the list. You can nest <code>Constraint</code>s in this way to describe the <q>shape</q> of the object graph that you are willing to accept.</p> <p>At any given time, the receiving Banana protocol has a single <code>Constraint</code> object that it enforces against the inbound data stream<span class="footnote">to be precise, each <code>Unslicer</code> on the receive stack has a <code>Constraint</code>, and the idea is that all of them get to pass judgement on the inbound token. A useful syntax to describe this sort of thing is still being worked out.</span>.</p> <h3>RemoteInterfaces</h3> <p>The <code class="API" base="foolscap.remoteinterface">RemoteInterface</code> is how you describe your constraints. You can provide a constraint for each argument of each method, as well as one for the return value. You can also specify additional flags on the methods. 
The convention (which is actually enforced by the code) is to name <code>RemoteInterface</code> objects with an <q>RI</q> prefix, like <code>RIFoo</code>.</p> <p><code>RemoteInterfaces</code> are created and used a lot like the usual <code>zope.interface</code>-style <code>Interface</code>. They look like class definitions, inheriting from <code>RemoteInterface</code>. For each method, the default value of each argument is used to create a <code>Constraint</code> for that argument. Basic types (<code>int</code>, <code>str</code>, <code>bool</code>) are converted into a <code>Constraint</code> subclass (<code class="API" base="foolscap.schema">IntegerConstraint</code>, <code class="API" base="foolscap.schema">StringConstraint</code>, <code class="API" base="foolscap.schema">BooleanConstraint</code>). You can also use instances of other <code>Constraint</code> subclasses, like <code class="API" base="foolscap.schema">ListOf</code> and <code class="API" base="foolscap.schema">DictOf</code>. This <code>Constraint</code> will be enforced against the value for the given argument. Unless you specify otherwise, remote callers must match all the <code>Constraint</code>s you specify, all arguments listed in the RemoteInterface must be present, and no arguments outside that list will be accepted.</p> <p>Note that, like zope.interface, these methods should <b>not</b> include <q><code>self</code></q> in their argument list. This is because you are documenting how <em>other</em> people invoke your methods. <code>self</code> is an implementation detail. <code>RemoteInterface</code> will complain if you forget.</p> <p>The <q>methods</q> in a <code>RemoteInterface</code> should return a single value with the same format as the default arguments: either a basic type (<code>int</code>, <code>str</code>, etc) or a <code>Constraint</code> subclass. This <code>Constraint</code> is enforced on the return value of the method. 
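</p>
<p>The default-argument convention can be sketched with the standard library alone. This is a toy model under stated assumptions, not Foolscap's actual implementation: the <code>describe</code> and <code>check_args</code> helpers are hypothetical names, and plain <code>isinstance</code> checks stand in for real <code>Constraint</code> objects.</p>
<pre class="python">
import inspect

def describe(method):
    """Return (argument types, return type) declared by a schema-style method."""
    signature = inspect.signature(method)
    arg_types = {name: param.default
                 for name, param in signature.parameters.items()}
    # Every argument has a default, so calling with no arguments works and
    # yields the declared return type.
    return arg_types, method()

def check_args(method, **kwargs):
    """Raise TypeError unless kwargs match the declared arguments exactly."""
    arg_types, _ = describe(method)
    # All listed arguments must be present, and no others are accepted.
    if set(kwargs) != set(arg_types):
        raise TypeError("expected arguments %s" % sorted(arg_types))
    for name, value in kwargs.items():
        if not isinstance(value, arg_types[name]):
            raise TypeError("argument %r must be %s"
                            % (name, arg_types[name].__name__))

def add(a=int, b=int):  # no "self", as in a RemoteInterface body
    return int

check_args(add, a=1, b=2)  # passes silently; a bad call raises TypeError
</pre>
<p>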
If you are calling a method in somebody else's process, the argument constraints will be applied as a courtesy (<q>be conservative in what you send</q>), and the return value constraint will be applied to prevent the server from doing evil things to you. If you are running a method on behalf of a remote client, the argument constraints will be enforced to protect <em>you</em>, while the return value constraint will be applied as a courtesy.</p> <p>Attempting to send a value that does not satisfy the Constraint will result in a <code class="API" base="foolscap">Violation</code> exception being raised.</p> <p>You can also specify methods by defining attributes of the same name in the <code>RemoteInterface</code> object. Each attribute value should be an instance of <code class="API" base="foolscap.schema">RemoteMethodSchema</code><span class="footnote">although technically it can be any object which implements the <code class="API" base="foolscap.schema">IRemoteMethodConstraint</code> interface</span>. This approach is more flexible: there are some constraints that are not easy to express with the default-argument syntax, and this is the only way to set per-method flags. Note that all such method-defining attributes must be set in the <code>RemoteInterface</code> body itself, rather than being set on it after the fact (i.e. <code>RIFoo.doBar = stuff</code>). This is required because the <code>RemoteInterface</code> metaclass magic processes all of these attributes only once, immediately after the <code>RemoteInterface</code> body has been evaluated.</p> <p>The <code>RemoteInterface</code> <q>class</q> has a name. Normally this is the (short) classname<span class="footnote"><code>RIFoo.__class__.__name__</code>, if <code>RemoteInterface</code>s were actually classes, which they're not</span>. You can override this name by setting a special <code>__remote_name__</code> attribute on the <code>RemoteInterface</code> (again, in the body). 
This name is important because it is externally visible: all <code>RemoteReference</code>s that point at your <code>Referenceable</code>s will remember the names of the <code>RemoteInterface</code>s they implement. This is what enables the type-checking to be performed on both ends of the wire.</p>
<p>In the future, this ought to default to the <b>fully-qualified</b> classname (like <code>package.module.RIFoo</code>), so that two RemoteInterfaces with the same name in different modules can co-exist. In the current release, these two RemoteInterfaces will collide (and provoke an import-time error message complaining about the duplicate name). As a result, if you have such classes (e.g. <code>foo.RIBar</code> and <code>baz.RIBar</code>), you <b>must</b> use <code>__remote_name__</code> to distinguish them (by naming one of them something other than <code>RIBar</code>) to avoid this error. Hopefully this will be improved in a future version, but it looks like a difficult change to implement, so the standing recommendation is to use <code>__remote_name__</code> on all your RemoteInterfaces, and set it to a suitably unique string (like a URI).</p>
<p>Here's an example:</p>
<pre class="python">
from foolscap.api import RemoteInterface, schema

class RIMath(RemoteInterface):
    __remote_name__ = "RIMath.using-foolscap.docs.foolscap.twistedmatrix.com"
    def add(a=int, b=int):
        return int
    # declare it with an attribute instead of a function definition
    subtract = schema.RemoteMethodSchema(a=int, b=int, _response=int)
    def sum(args=schema.ListOf(int)):
        return int
</pre>
<h3>Using RemoteInterface</h3>
<p>To declare that your <code>Referenceable</code> responds to a particular <code>RemoteInterface</code>, use the normal <code>implements()</code> annotation:</p>
<pre class="python">
from zope.interface import implements
from foolscap.api import Referenceable

# RIMath is the RemoteInterface defined in the previous example.
# On Python 3, use the @implementer(RIMath) class decorator instead.
class MathServer(Referenceable):
    implements(RIMath)

    def remote_add(self, a, b):
        return a+b
    def remote_subtract(self, a, b):
        return a-b
    def remote_sum(self, args):
        total = 0
        for a in args:
            total += a
        return total
</pre>
<p>To enforce constraints everywhere, both sides will need to know about the <code>RemoteInterface</code>, and both must know it by the same name. It is a good idea to put the <code>RemoteInterface</code> in a common file that is imported into the programs running on both sides. It is up to you to make sure that both sides agree on the interface. Future versions of Foolscap may implement some sort of checksum-verification or Interface-serialization as a failsafe, but fundamentally the <code>RemoteInterface</code> that <em>you</em> are using defines what <em>your</em> program is prepared to handle. There is no difference between an old client accidentally using a different version of the RemoteInterface and a malicious attacker actively trying to confuse your code. The only promise that Foolscap can make is that the constraints you provide in the RemoteInterface will be faithfully applied to the incoming data stream, so that you don't need to do the type checking yourself inside the method.</p>
<p>When making a remote method call, you use the <code>RemoteInterface</code> to identify the method instead of a string. This scopes the method name to the RemoteInterface:</p>
<pre class="python">
d = remote.callRemote(RIMath["add"], a=1, b=2)
# or
d = remote.callRemote(RIMath["add"], 1, 2)
</pre>
<h2>Pass-By-Copy</h2>
<p>You can pass (nearly) arbitrary instances over the wire. Foolscap knows how to serialize all of Python's native data types already: numbers, strings, unicode strings, booleans, lists, tuples, dictionaries, sets, and the None object. You can teach it how to serialize instances of other types too.
Foolscap will not serialize (or deserialize) any class that you haven't taught it about, both for security and because it refuses the temptation to guess your intentions about how these unknown classes ought to be serialized.</p>
<p>The simplest possible way to pass things by copy is demonstrated in the following code fragment:</p>
<pre class="python">
from foolscap.api import Copyable, RemoteCopy

class MyPassByCopy(Copyable, RemoteCopy):
    typeToCopy = copytype = "MyPassByCopy"
    def __init__(self):
        # RemoteCopy subclasses may not accept any __init__ arguments
        pass
    def setCopyableState(self, state):
        self.__dict__ = state
</pre>
<p>If the code on both sides of the wire imports this class, then any instances of <code>MyPassByCopy</code> that are present in the arguments of a remote method call (or returned as the result of a remote method call) will be serialized and reconstituted into an equivalent instance on the other side.</p>
<p>For more complicated things to do with pass-by-copy, see the documentation on <a href="copyable.xhtml">Copyable</a>. This explains the difference between <code>Copyable</code> and <code>RemoteCopy</code>, how to control the serialization and deserialization process, and how to arrange for serialization of third-party classes that are not subclasses of <code>Copyable</code>.</p>
<h2>Third-party References</h2><a name="introductions" />
<p>Another new feature of Foolscap is the ability to send <code>RemoteReference</code>s to third parties. The classic scenario for this is illustrated by the <a href="http://www.erights.org/elib/capability/overview.html">three-party Granovetter diagram</a>. One party (Alice) has RemoteReferences to two other objects named Bob and Carol. She wants to share her reference to Carol with Bob, by including it in a message she sends to Bob (i.e. by using it as an argument when she invokes one of Bob's remote methods).
The Foolscap code for doing this would look like:</p>
<pre class="python">
bobref.callRemote("foo", intro=carolref)
</pre>
<p>When Bob receives this message (i.e. when his <code>remote_foo</code> method is invoked), he will discover that he's holding a fully-functional <code>RemoteReference</code> to the object named Carol<span class="footnote">and if everyone involved is using authenticated Tubs, then Foolscap offers a guarantee, in the cryptographic sense, that Bob will wind up with a reference to the same object that Alice intended. The authenticated FURLs prevent DNS-spoofing and man-in-the-middle attacks.</span>. He can start using this RemoteReference right away:</p>
<pre class="python">
from foolscap.api import Referenceable

class Bob(Referenceable):
    def remote_foo(self, intro):
        self.carol = intro
        # use the new reference immediately, and pass it back to Carol
        self.carol.callRemote("howdy", msg="Pleased to meet you", you=intro)
        return self.carol
</pre>
<p>If Bob sends this <code>RemoteReference</code> back to Alice, her method will see the same <code>RemoteReference</code> that she sent to Bob.
In this example, Bob sends the reference by returning it from the original <code>remote_foo</code> method call, but he could almost as easily send it in a separate method call.</p>
<pre class="python">
from foolscap.api import Referenceable

class Alice(Referenceable):
    # self.bob is assumed to be a RemoteReference to Bob, obtained earlier
    def start(self, carol):
        self.carol = carol
        d = self.bob.callRemote("foo", intro=carol)
        d.addCallback(self.didFoo)
    def didFoo(self, result):
        assert result is self.carol  # this will be true
</pre>
<p>Moreover, if Bob sends it back to <em>Carol</em> (completing the three-party round trip), Carol will see it as her original <code>Referenceable</code>.</p>
<pre class="python">
from foolscap.api import Referenceable

class Carol(Referenceable):
    def remote_howdy(self, msg, you):
        assert you is self  # this will be true
</pre>
<p>In addition to this, in the four-party introduction sequence as used by the <a href="http://www.erights.org/elib/equality/grant-matcher/index.html">Grant Matcher Puzzle</a>, when a Referenceable is sent to the same destination through multiple paths, the recipient will receive the same <code>RemoteReference</code> object from both sides.</p>
<p>For a <code>RemoteReference</code> to be transferable to third parties in this fashion, the original <code>Referenceable</code> must live in a Tub which has a working listening port, and an established base FURL. It is not necessary for the Referenceable to have been published with <code>registerReference</code> first: if it is sent over the wire before a name has been associated with it, it will be registered under a new random and unguessable name. The <code>RemoteReference</code> will contain the resulting FURL, enabling it to be sent to third parties.</p>
<p>When this introduction is made, the receiving system must establish a connection with the Tub that holds the original Referenceable, and acquire its own RemoteReference. These steps must take place before the remote method can be invoked, and other method calls might arrive before they do.
All subsequent method calls are queued until the one that involved the introduction is performed. Foolscap guarantees (by default) that the messages sent to a given Referenceable will be delivered in the same order in which they were sent. In the future there may be options to relax this guarantee, in exchange for higher performance, reduced memory consumption, multiple priority queues, limited latency, or other features. There might even be an option to turn off introductions altogether.</p>
<p>Also note that enabling this capability means any of your communication peers can make you create TCP connections to hosts and port numbers of their choosing. The fact that those connections can only speak the Foolscap protocol may reduce the security risk presented, but it still lets other people be annoying.</p>
</body></html>