Parsing HTML Forms ================== Sometimes in functional tests, information from a generated form must be extracted in order to re-submit it as part of a subsequent request. The `zope.testing.formparser` module can be used for this purpose. The scanner is implemented using the `FormParser` class. The constructor arguments are the page data containing the form and (optionally) the URL from which the page was retrieved: >>> import zope.testing.formparser >>> page_text = '''\ ... <html><body> ... <form name="form1" action="/cgi-bin/foobar.py" method="POST"> ... <input type="hidden" name="f1" value="today" /> ... <input type="submit" name="do-it-now" value="Go for it!" /> ... <input type="IMAGE" name="not-really" value="Don't." ... src="dont.png" /> ... <select name="pick-two" size="3" multiple> ... <option value="one" selected>First</option> ... <option value="two" label="Second">Another</option> ... <optgroup> ... <option value="three">Third</option> ... <option selected="selected">Fourth</option> ... </optgroup> ... </select> ... </form> ... ... Just for fun, a second form, after specifying a base: ... <base href="http://www.example.com/base/" /> ... <form action = 'sproing/sprung.html' enctype="multipart/form"> ... <textarea name="sometext" rows="5">Some text.</textarea> ... <input type="Image" name="action" value="Do something." ... src="else.png" /> ... <input type="text" value="" name="multi" size="2" /> ... <input type="text" value="" name="multi" size="3" /> ... </form> ... </body></html> ... ''' >>> parser = zope.testing.formparser.FormParser(page_text) >>> forms = parser.parse() >>> len(forms) 2 >>> forms.form1 is forms[0] True >>> forms.form1 is forms[1] False More often, the `parse()` convenience function is all that's needed: >>> forms = zope.testing.formparser.parse( ... page_text, "http://cgi.example.com/somewhere/form.html") >>> len(forms) 2 >>> forms.form1 is forms[0] True >>> forms.form1 is forms[1] False Once we have the form we're interested in, we can check form attributes and individual field values: >>> form = forms.form1 >>> form.enctype 'application/x-www-form-urlencoded' >>> form.method 'post' >>> keys = form.keys() >>> keys.sort() >>> keys ['do-it-now', 'f1', 'not-really', 'pick-two'] >>> not_really = form["not-really"] >>> not_really.type 'image' >>> not_really.value "Don't." >>> not_really.readonly False >>> not_really.disabled False Note that relative URLs are converted to absolute URLs based on the ``<base>`` element (if present) or using the base passed in to the constructor. >>> form.action 'http://cgi.example.com/cgi-bin/foobar.py' >>> not_really.src 'http://cgi.example.com/somewhere/dont.png' >>> forms[1].action 'http://www.example.com/base/sproing/sprung.html' >>> forms[1]["action"].src 'http://www.example.com/base/else.png' Fields which are repeated are reported as lists of objects that represent each instance of the field:: >>> field = forms[1]["multi"] >>> type(field) <type 'list'> >>> [o.value for o in field] ['', ''] >>> [o.size for o in field] [2, 3] The ``<textarea>`` element provides some additional attributes: >>> ta = forms[1]["sometext"] >>> print ta.rows 5 >>> print ta.cols None >>> ta.value 'Some text.' The ``<select>`` element provides access to the options as well: >>> select = form["pick-two"] >>> select.multiple True >>> select.size 3 >>> select.type 'select' >>> select.value ['one', 'Fourth'] >>> options = select.options >>> len(options) 4 >>> [opt.label for opt in options] ['First', 'Second', 'Third', 'Fourth'] >>> [opt.value for opt in options] ['one', 'two', 'three', 'Fourth']