Roblox Requests

HTML/XML Parsing

Parsing and navigating HTML and XML documents with Roblox Requests.

Roblox Requests can generate a DOM from HTML or XML directly from a response. This lets you work with the many APIs that return XML, as well as fetch data from any webpage.

DOM From Response Content

HTML

In the Quick Start, we learned to parse JSON response content using the :json() method:

local r = http.get("https://httpbin.org/get")
local data = r:json()

Parsing an HTML or XML body is just as easy. To generate a DOM from an HTML response, we’ll use the :html() method.

local r = http.get("https://github.com/")
local root = r:html()

Requests checks the response’s Content-Type before parsing to make sure the response is actually HTML. If the content type doesn’t match, it throws an error:

local r = http.get("https://httpbin.org/get")
local root = r:html()
-- error: [http] Response is not specified as HTML.

If a server isn’t specifying the content type correctly, you can ignore this with the optional ignore_content_type argument:

local root = r:html(true)  -- set ignore_content_type to true
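
If you would rather not ignore the content type unconditionally, one option is to try a strict parse first and retry only when the content-type check fails. The sketch below is an illustration rather than part of the library: parse_html_lenient is a hypothetical helper, and it assumes the error raised by :html() contains the message text shown above.

```lua
-- Hypothetical helper (not part of Roblox Requests).
-- Tries a strict parse, then retries with ignore_content_type = true
-- only when the failure looks like the content-type error shown above.
local function parse_html_lenient(response)
    local ok, result = pcall(function()
        return response:html()
    end)
    if ok then
        return result, false
    end
    -- Assumption: the content-type error message contains this text.
    if string.find(tostring(result), "not specified as HTML", 1, true) then
        return response:html(true), true
    end
    error(result, 0) -- unrelated failure: re-raise it unchanged
end

-- Usage (with `http` set up as in the examples above):
-- local root, fell_back = parse_html_lenient(http.get("https://github.com/"))
```

The second return value reports whether the fallback was used, which can help you spot servers that send the wrong Content-Type.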

XML

To parse XML content, use the :xml() method. Like :html(), it throws an error if the Content-Type is not XML-based.

local r = http.get("https://www.w3schools.com/xml/note.xml")
local my_xml = r:xml()

Parse From String

Requests also allows you to generate a DOM from a string:

local html = http.parse_html("<html> </html>")
local xml = http.parse_xml("<xml> </xml>")

You can find specific elements in a document using a subset of jQuery’s selector strings:

local elements = root:select(selectorstring)

Or in shorthand:

local elements = root(selectorstring)

This returns a list of elements. Each element is of the same type as the root element, so it supports selecting as well:

for _, element in ipairs(elements) do
    print(element.name)
    local subs = element(subselectorstring)
    for _, sub in ipairs(subs) do
        print(sub.name)
    end
end
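
The nested loop above can be wrapped in a small helper when you only need the inner matches. This is a minimal sketch: select_nested is a hypothetical name, and it relies only on the :select() method described above being available on both the root and the returned elements.

```lua
-- Hypothetical helper (not part of Roblox Requests).
-- Collects every element matching `inner_selector` found inside
-- each element matching `outer_selector`, in document order of the outer matches.
local function select_nested(root, outer_selector, inner_selector)
    local found = {}
    for _, element in ipairs(root:select(outer_selector)) do
        for _, sub in ipairs(element:select(inner_selector)) do
            table.insert(found, sub)
        end
    end
    return found
end

-- Usage (selector strings here are placeholders):
-- for _, sub in ipairs(select_nested(root, selectorstring, subselectorstring)) do
--     print(sub.name)
-- end
```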

Selectors

Supported selectors are a subset of jQuery’s selectors:

Selectors can be combined; e.g. ".class:not([attribute]) element.class"

Elements

All tree elements have the following properties and methods:

Properties

Some element properties are specified as Sets. You must call :to_list() on these properties to get their values.
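
As a sketch of how this might look in practice, the helper below reads a Set-valued property as a plain list. set_property_values is a hypothetical name, and the property name is supplied by the caller; the only library call it assumes is the :to_list() method mentioned above.

```lua
-- Hypothetical helper (not part of Roblox Requests).
-- Returns the values of a Set-valued property as a plain list,
-- or an empty list when the element has no such property.
local function set_property_values(element, property)
    local set = element[property]
    if set == nil then
        return {}
    end
    return set:to_list()
end

-- Usage (`property_name` stands for whichever Set property you need):
-- for _, value in ipairs(set_property_values(element, property_name)) do
--     print(value)
-- end
```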

Methods

Source for the parser can be found at the project’s GitHub.

local document = http.get("https://github.com"):html()

for _, url in ipairs(document:absolute_links()) do
    print(url)
end

-- https://github.com/#start-of-content
-- https://help.github.com/articles/supported-browsers
-- https://github.com/
-- https://github.com/join?ref_cta=Sign+up&amp;ref_loc=header+logged+out&amp;ref_page=%2F&amp;source=header-home
-- ...
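
Since :absolute_links() can be iterated with ipairs, filtering the result is straightforward. The following is a minimal sketch, assuming you want only unique links under a given prefix; links_with_prefix is a hypothetical helper, not a library function.

```lua
-- Hypothetical helper (not part of Roblox Requests).
-- Returns the unique absolute links that start with `prefix`,
-- preserving the order in which they first appear.
local function links_with_prefix(document, prefix)
    local seen, result = {}, {}
    for _, url in ipairs(document:absolute_links()) do
        -- Plain substring comparison avoids Lua pattern surprises
        -- with characters such as "." or "?" in URLs.
        if string.sub(url, 1, #prefix) == prefix and not seen[url] then
            seen[url] = true
            table.insert(result, url)
        end
    end
    return result
end

-- Usage:
-- for _, url in ipairs(links_with_prefix(document, "https://help.github.com/")) do
--     print(url)
-- end
```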

Powered by Vadim A. Misbakh-Soloviov’s lua-htmlparser.
