Using Conifer and/or Webrecorder
NYARC has on many occasions captured responsive content (at times including content that Brozzler failed to capture) using the Conifer and/or Webrecorder software services. In such cases, NYARC has then uploaded the WARC files generated by Conifer or Webrecorder to the corresponding NYARC collections in Archive-It.
Because the capture and replay technologies of Conifer and Webrecorder differ from those of Archive-It, this strategy has not always succeeded on the first pass. In many instances, however, it has been a viable workaround for content that Archive-It would otherwise struggle to capture, especially content that is highly responsive to user input.
Several factors are important to consider when uploading external WARCs to Archive-It:
For each URL that is captured in an uploaded WARC, a new private seed must be added to that collection.
After upload, a seed may take 24 hours or longer to display the uploaded content.
Uploaded WARCs will only display properly in Wayback on the capture date that corresponds with the date the uploaded WARC was created. For example, a seed captured in Conifer on August 12th will usually only play back as captured in Conifer on the August 12th capture date, even if that seed was captured (by Archive-It’s crawler) on separate dates, before or after.
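Two of the factors above (one seed per captured URL, and playback tied to the date the WARC was created) can be checked before upload by reading the WARC's record headers. The following is a minimal sketch, not part of NYARC's documented workflow. It uses only the Python standard library, and the function names `open_warc`, `iter_warc_headers`, and `capture_dates` are hypothetical. It assumes a well-formed WARC without folded header lines, exact string matches between seed URLs and `WARC-Target-URI` values, and that the first ten characters of `WARC-Date` (a UTC timestamp) give the relevant calendar date.

```python
import gzip


def open_warc(path):
    """Open a WARC file for binary reading, handling gzipped (.gz) exports."""
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")


def iter_warc_headers(stream):
    """Yield the WARC header fields of each record as a dict with lowercase keys."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank lines that separate records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if not line.strip():
                break  # a blank line (or end of file) ends the header block
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Skip the record body using its declared length.
        stream.read(int(headers.get("content-length", "0")))
        yield headers


def capture_dates(stream, seed_urls):
    """Map each seed URL to the sorted capture dates of its response records."""
    found = {url: set() for url in seed_urls}
    for headers in iter_warc_headers(stream):
        if headers.get("warc-type") != "response":
            continue
        # Some tools wrap the target URI in angle brackets; strip them.
        uri = headers.get("warc-target-uri", "").strip("<>")
        if uri in found:
            found[uri].add(headers.get("warc-date", "")[:10])
    return {url: sorted(dates) for url, dates in found.items()}
```

Calling `capture_dates(open_warc(path), seed_urls)` on an exported file returns, for each seed URL, the dates under which its capture would be expected to appear. A seed URL that comes back with an empty list was not found as a response record; it may need to be re-recorded, or entered in exactly the form in which it was recorded (for example, with or without a trailing slash).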
Capturing content behind login screens
The NYARC Web Archiving Program has captured content behind login screens with external recording tools and then uploaded the resulting WARC files directly into Archive-It. NYARC has used both Conifer and Webrecorder’s ArchiveWeb.page tool for this purpose.
This online exhibition at Pace Gallery prompts the user to enter an email address, which makes the page impossible to capture with either Standard or Brozzler crawls in Archive-It.
In ArchiveWeb.page, the capture (or “recording”) was started after an email address was entered. The WARC file that ArchiveWeb.page produced was then uploaded to Archive-It.
For online exhibitions behind email login screens, the following workflow has worked for NYARC as of late 2021:
Navigate to the online exhibition URL in either Conifer or ArchiveWeb.page.
Before beginning the recording, enter an email address into the form.
Start the recording in Conifer or ArchiveWeb.page, and navigate to all content on the page to ensure full capture.
If multiple discrete pages are to be captured, repeat the above process for each additional seed URL, adding it to the same “Collection” (Conifer) or “Archive” (ArchiveWeb.page) as your previous capture.
Export the WARC file(s) from this Collection or Archive. One WARC file will be produced from each Collection/Archive.
In Archive-It, add the seed URLs of all pages you just recorded to the NYARC collection in which they should appear – e.g., New York City Galleries. Use the settings Private, One-Time, and Standard for each seed.
Using the “Upload WARCs” tab for the collection, upload the WARC file(s). After 24 hours or more, the recorded content should display on the Wayback calendar dates corresponding to the dates on which the WARCs were created (that is, the dates of recording in Conifer or ArchiveWeb.page).