
April 10, 2024

Product Release: Crawl Requests

With Crawl Requests, researchers can direct our crawlers at the keywords, profiles, or Telegram channels that they’d like to track.

TLDR

Background

crawl requests dashboard
In the Crawl Requests UI, users can view all crawl requests, manage request statuses, add new requests, and monitor trends.

Researchers often need data collected for specific keywords, user or group profiles, and Telegram channels to support project-specific investigations. Historically, we met that need through manual configuration on request and through our API. Today, we’re happy to share the launch of the Crawl Requests dashboard, where users can control Open Measures’ scope of coverage themselves.

Here’s a high-level overview of how this new tooling works:

crawl requests diagram
Crawl Requests works in five steps: Partners provide targets for the crawl (keywords, profiles, or channels) (1). Open Measures’ crawlers pick up jobs (2) and collect data using a source’s native search interface (3). The collected data is then stored (4). Partners can then access that data on the Open Measures platform via the API, the Research Dashboard, etc. (5).

With this new feature, users have better control over Open Measures’ crawlers and can target thorough crawls at specific subsets of the sources we collect from. Depending on the dataset, we enumerate through a list of targeted keywords, user profiles, or channels.

The * in the above diagram represents the other crawling processes that Open Measures maintains to collect data from our sources. Crawl Requests is a standalone collection stack that runs in parallel with these default data collection systems.
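The five-step flow described above can be sketched as a small job queue. This is an illustrative sketch only, not Open Measures’ implementation: `CrawlRequest`, `fake_native_search`, and `run_crawl_requests` are hypothetical names, and the search function returns canned results rather than contacting any real site.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlRequest:
    """Hypothetical shape of a crawl request (not a real Open Measures type)."""
    source: str   # e.g. "truthsocial" (assumed identifier)
    kind: str     # "keyword", "profile", or "channel"
    target: str   # the keyword, profile name, or channel name


def fake_native_search(request):
    """Stand-in for a source's native search; returns canned results."""
    return [f"{request.source}:{request.target}:post-{i}" for i in range(3)]


def run_crawl_requests(requests, search=fake_native_search):
    queue = deque(requests)        # (1) partners provide targets
    store = {}                     # in-memory stand-in for storage
    while queue:
        job = queue.popleft()      # (2) a crawler picks up a job
        results = search(job)      # (3) collect via the source's search
        store.setdefault(job, []).extend(results)  # (4) store the data
    return store                   # (5) data is available for access


request = CrawlRequest("truthsocial", "keyword", "trump 2024")
store = run_crawl_requests([request])
print(len(store[request]))
```

In this sketch the queue is drained by a single loop; a real system would presumably run many crawlers concurrently alongside the default collection processes.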

Keywords and Profiles

crawl requests keywords
Profiles and keywords can be added to requests for each source.

Most sites that Open Measures collects from have native search bars. When a crawl request is made for a keyword, Open Measures’ crawlers run a search for that keyword using the dataset’s search bar, enumerate all the results, and save the collected data.

crawl requests feed image
Example search interface on Truth Social. In this use case, the keyword request is for “trump 2024”. Open Measures’ crawlers then emulate a search through Truth Social’s own search engine, so the collected set always consists of data available directly on the site.

Crawl Requests allows Open Measures users to request keyword crawling on a per-source basis.
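As a rough illustration of the keyword crawl described in this section (run a search, enumerate every result, save what was found), the sketch below pages through a search interface until no pages remain. Everything here is assumed for the example: `enumerate_results`, `fake_search_page`, the integer cursor, and the sample posts are hypothetical and do not reflect any real site’s search API.

```python
# Hypothetical sample data standing in for a site's posts.
POSTS = [
    {"id": 0, "text": "Trump 2024 rally tonight"},
    {"id": 1, "text": "weather update"},
    {"id": 2, "text": "trump 2024 merch"},
    {"id": 3, "text": "Vote Trump 2024!"},
    {"id": 4, "text": "unrelated post"},
]


def fake_search_page(keyword, cursor, page_size=2):
    """Stand-in for one page of native search results.

    Returns (results, next_cursor); next_cursor is None on the last page.
    """
    matches = [p for p in POSTS if keyword.lower() in p["text"].lower()]
    start = cursor or 0
    end = start + page_size
    next_cursor = end if end < len(matches) else None
    return matches[start:end], next_cursor


def enumerate_results(keyword, search_page):
    """Yield every result for a keyword, following pagination to the end."""
    cursor = None
    seen = set()
    while True:
        results, cursor = search_page(keyword, cursor)
        for post in results:
            if post["id"] not in seen:  # skip duplicates across pages
                seen.add(post["id"])
                yield post
        if cursor is None:
            break


collected = list(enumerate_results("trump 2024", fake_search_page))
print([post["id"] for post in collected])
```

Tracking seen IDs is a defensive choice for this sketch, since search results can shift between page requests on a live site.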

Telegram Channels

add channels
Adding new Telegram channels takes only the touch of a button.

Crawl Requests also allows partners to submit Telegram channels. When Open Measures receives a request to crawl Telegram data, we back-crawl the channel all the way to its first message and then monitor new messages live going forward.
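The back-crawl-then-monitor pattern can be sketched with an in-memory channel. This is a simplified illustration under stated assumptions: `FakeChannel`, `back_crawl`, and `poll_live` are hypothetical, messages are assumed to carry increasing integer IDs, and nothing here calls the actual Telegram API.

```python
class FakeChannel:
    """In-memory stand-in for a channel with increasing message IDs."""

    def __init__(self, count):
        self.messages = [
            {"id": i, "text": f"message {i}"} for i in range(1, count + 1)
        ]

    def history(self, before_id, limit):
        """Return up to `limit` messages older than `before_id`, newest first."""
        older = [m for m in self.messages if m["id"] < before_id]
        return older[::-1][:limit]

    def newer_than(self, last_id):
        """Return messages posted after `last_id`, oldest first."""
        return [m for m in self.messages if m["id"] > last_id]

    def post(self, text):
        self.messages.append({"id": len(self.messages) + 1, "text": text})


def back_crawl(channel, page_size=2):
    """Walk backwards page by page until the first message is reached."""
    collected = []
    before_id = float("inf")
    while True:
        page = channel.history(before_id, page_size)
        if not page:
            break  # nothing older remains: the first message was reached
        collected.extend(page)
        before_id = page[-1]["id"]  # continue from the oldest message seen
    return sorted(collected, key=lambda m: m["id"])


def poll_live(channel, last_id):
    """One polling step of live monitoring: fetch anything newer than last_id."""
    return channel.newer_than(last_id)


channel = FakeChannel(5)
archive = back_crawl(channel)
channel.post("a new message arrives")
live = poll_live(channel, archive[-1]["id"])
print(len(archive), len(live))
```

Polling from the newest archived ID means messages posted while the back-crawl was still running are picked up by the first live poll rather than lost.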

Conclusion

Open Measures’ Crawl Requests UI is now available to partnered users and organizations. This innovative new application allows users to take control of our crawlers, pointing them at unique keywords, profiles, and channels that their researchers care about most.

