Web applications

As with ‘static’ elements such as images and PDFs, the web crawler can treat many of the components that a highly dynamic, user-responsive web application requires as out of scope when they are hosted on domains sufficiently distant from the crawl’s seed URL(s). Even when the application itself is captured completely, it will still fail to render the informational content that it is designed to present if that content likewise originates from a remote host. In either case, the problem manifests in the archival environment, whether Wayback view or Proxy mode, as applications so unresponsive or incomplete as to be functionally useless.
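The scoping problem described above can be illustrated with a minimal sketch. It assumes a deliberately simplified rule: a resource is in scope only when its host matches a seed's host or is a subdomain of it. This is not the crawler's actual scoping logic, and the function names and example URLs are hypothetical. It only shows how one might list the remote hosts that an application depends on before deciding which to add to a crawl's scope.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased hostname of a URL, or an empty string if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_in_scope(url, seed_urls):
    """Simplified assumption: in scope if the host equals a seed host or is its subdomain."""
    host = host_of(url)
    if not host:
        return False
    for seed in seed_urls:
        seed_host = host_of(seed)
        if seed_host and (host == seed_host or host.endswith("." + seed_host)):
            return True
    return False


def out_of_scope_hosts(resource_urls, seed_urls):
    """Return the sorted, de-duplicated hosts that the simplified rule would exclude."""
    return sorted(
        {host_of(u) for u in resource_urls if host_of(u) and not is_in_scope(u, seed_urls)}
    )


# Hypothetical seed and the resources that one of its pages embeds.
seeds = ["http://example.org/"]
resources = [
    "http://example.org/exhibitions/",
    "http://media.example.org/app.js",
    "http://cdn.example.net/viewer.swf",
    "https://publisher.example.com/embed/123",
]

print(out_of_scope_hosts(resources, seeds))
# → ['cdn.example.net', 'publisher.example.com']
```

The two hosts reported here stand in for the remote application and content hosts that would need their own scoping rules. Anything served from the seed's own domain or subdomains passes the check.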

NYARC has had varying levels of success in completely capturing content from the specific digital publishing application hosts below. Ultimately, however, externally hosted Flash-based or otherwise responsive applications that fail to capture or render in an archival environment are beyond NYARC’s capacity to improve and require intervention by Archive-It. In the meantime, to determine best practices for the content within its scope, it is incumbent upon NYARC to refine the following experimental scoping rules for each service and to report their results:

For directions specific to video-hosting web applications, see Video.

When these applications are instead hosted on domains or subdomains internal to the crawled seed(s), refer to the directions for mitigating problems encountered while crawling responsive content.