Sand Ark
Design Lab




XScrape is a free web spider and scraper that extracts strings matching a regular expression. Starting from a URL you supply, it traverses the links it finds. Combine keyword-targeted search engine results with XScrape to gather data about potential customers and business rivals, or to research esoteric information on the web. Extract email addresses, web metadata, or any phrase you can describe with a regular expression.
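The traversal described above can be pictured with a short Python sketch. This is not XScrape's source code, only an illustration under stated assumptions: the pages live in a hypothetical in-memory dictionary named PAGES instead of being downloaded, links are found with a simple href pattern, and the function name crawl is made up for this example.

```python
import re
from collections import deque

# Hypothetical in-memory "site": URL -> page text. A real spider would
# download each page over HTTP instead.
PAGES = {
    "http://example.com/": 'Contact: sales@example.com <a href="http://example.com/about">About</a>',
    "http://example.com/about": 'Team: info@example.com <a href="http://example.com/">Home</a>',
}

# Simplified link extraction (assumption): only double-quoted href values.
LINK_RE = re.compile(r'href="([^"]+)"')


def crawl(start_url, pattern, pages):
    """Breadth-first traversal that collects every regex match per page."""
    matcher = re.compile(pattern)
    seen = {start_url}
    queue = deque([start_url])
    found = []
    while queue:
        url = queue.popleft()
        text = pages.get(url, "")  # unknown URLs are treated as empty pages
        found.extend(matcher.findall(text))
        for link in LINK_RE.findall(text):
            if link not in seen:  # visit each URL only once
                seen.add(link)
                queue.append(link)
    return found


matches = crawl("http://example.com/", r"[\w]+@[\w]+\.[\w]{2,3}", PAGES)
print(matches)
```

The set of seen URLs keeps the spider from looping when two pages link to each other, as the two sample pages do.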

XScrape
* As-is software - no warranty of any kind is expressed or implied. XScrape does not include spyware, trickler, or advertisement features.


Download


Instructions

Click the download button and save the archive file. Extract the executable and double-click it to start. XScrape requires Microsoft .NET Framework 3.0, which comes standard on Windows Vista. Windows XP users can download the .NET Framework from Microsoft.

Field Definition

URL - The starting URL.

Cookie - Enter the session cookie if the URL is accessible only in a specific state, such as after the user has logged in. To retrieve the cookie, enter "javascript:document.write(document.cookie);" in the browser address bar while on the site, then copy and paste the output into the cookie text field.

User-Agent - Enter the browser user-agent string you want to emulate. By default, the string is set to that of the Chrome browser. Additional user-agent strings can be found on Wikipedia.
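To show what the Cookie and User-Agent fields amount to at the HTTP level, here is a minimal Python sketch using the standard urllib.request module. It is an illustration, not XScrape's implementation: the helper name build_request, the URL, and the cookie and user-agent values are all placeholders invented for this example.

```python
import urllib.request


def build_request(url, cookie="", user_agent=""):
    """Create a request carrying the optional Cookie and User-Agent headers."""
    request = urllib.request.Request(url)
    if cookie:
        request.add_header("Cookie", cookie)
    if user_agent:
        request.add_header("User-Agent", user_agent)
    return request


# Placeholder values for illustration only.
req = build_request(
    "http://example.com/members",
    cookie="sessionid=abc123; theme=dark",
    user_agent="Mozilla/5.0 (Windows NT 6.0) ExampleBrowser/1.0",
)

# urllib stores header names capitalized, so the lookup key is "User-agent".
print(req.get_header("Cookie"))
print(req.get_header("User-agent"))
```

No network traffic occurs until the request is actually opened, so the sketch can be run safely to inspect the headers.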

Inclusive URLs - A comma-separated list of URL substrings to traverse. Links whose URLs do not contain at least one substring from this list are not followed. Valid entries include "yahoo.com", "mail.yahoo.com", and "www.yahoo.com, www.google.com, microsoft.com"; any text is accepted.

Exclusive URLs - A comma-separated list of URL substrings to exclude from traversal. For example, with "yahoo.com" in the inclusive list and "mail.yahoo.com" in the exclusive list, XScrape follows only links that contain "yahoo.com" and skips those that contain "mail.yahoo.com".
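The interplay of the two lists can be sketched in Python as a plain substring check. The function names parse_list and should_visit are hypothetical, and two behaviors are assumptions rather than documented XScrape facts: an empty inclusive list is treated as "no restriction", and matching is case-sensitive.

```python
def parse_list(field):
    """Split a comma-separated field into trimmed, non-empty substrings."""
    return [item.strip() for item in field.split(",") if item.strip()]


def should_visit(url, inclusive="", exclusive=""):
    """Return True if the URL passes the inclusive and exclusive lists."""
    include = parse_list(inclusive)
    exclude = parse_list(exclusive)
    # Assumption: an empty inclusive list places no restriction on the crawl.
    if include and not any(part in url for part in include):
        return False
    # An exclusive match always wins over an inclusive match.
    return not any(part in url for part in exclude)


print(should_visit("http://www.yahoo.com/news", "yahoo.com", "mail.yahoo.com"))
print(should_visit("http://mail.yahoo.com/inbox", "yahoo.com", "mail.yahoo.com"))
```

With the lists from the example above, the first URL is followed and the second is skipped, because "mail.yahoo.com" appears in the exclusive list.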

Regular Expression - A string pattern that describes a group of text. For example, "[\w]+@[\w]+\.[\w]{2,3}" is a rudimentary pattern for an email address. For additional reference click here.
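To try the email pattern outside XScrape, the following Python sketch applies it to a sample sentence. It uses Python's re module rather than the .NET engine XScrape runs on; this particular pattern uses only constructs the two share, but treat the result as an approximation. The sample text and the name EMAIL_PATTERN are invented for this example.

```python
import re

# The rudimentary email pattern from the field definition above.
EMAIL_PATTERN = re.compile(r"[\w]+@[\w]+\.[\w]{2,3}")

sample = "Write to Support@SandArk.com or info@example.org today."
emails = EMAIL_PATTERN.findall(sample)
print(emails)  # ['Support@SandArk.com', 'info@example.org']
```

The pattern is rudimentary in a concrete sense: it does not allow dots or hyphens in the part before the "@", and it captures at most three characters after the final matched dot, so an address such as "a@b.info" would come back as "a@b.inf".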

Contact

Send your questions, suggestions, and feedback to Support@SandArk.com.

Copyright © 2023 Sand Ark. All Rights Reserved.