
Custom Extraction – Scrape any data from the HTML of a URL using XPath, CSS Path selectors or regex.
Custom Source Code Search – Find anything you want in the source code of a website, whether that’s Google Analytics code, specific text or other code.
Custom HTTP Headers – Supply any header value in a request, from Accept-Language to Cookie.
User-Agent Switcher – Crawl as Googlebot, Bingbot, Yahoo! Slurp, mobile user-agents or your own custom UA.
Images – All URLs with the image link & all images from a given page. Images over 100 KB, missing alt text, alt text over 100 characters.
AJAX – Select to obey Google’s now-deprecated AJAX Crawling Scheme.
Rendering – Crawl JavaScript frameworks like AngularJS and React, by crawling the rendered HTML after JavaScript has executed.
Outlinks – View all pages a URL links out to, as well as resources.
Inlinks – View all pages linking to a URL, the anchor text and whether the link is follow or nofollow.
hreflang Attributes – Audit missing confirmation links, inconsistent & incorrect language codes, non-canonical hreflang and more.
Redirect Chains – Discover redirect chains and loops.
Follow & Nofollow – View meta nofollow and nofollow link attributes.
Pagination – View rel="next" and rel="prev" attributes.
X-Robots-Tag – See directives issued via the HTTP header.
Canonicals – Link elements & canonical HTTP headers.
Meta Refresh – Including target page and time delay.
Meta Robots – Index, noindex, follow, nofollow, noarchive, nosnippet etc.
H2 – Missing, duplicate, long, short or multiple headings.
H1 – Missing, duplicate, long, short or multiple headings.
Word Count – Analyse the number of words on every page.
Crawl Depth – View how deep a URL is within a website’s architecture.
Last-Modified Header – View the last modified date in the HTTP header.
Response Time – View how long pages take to respond to requests.
Meta Keywords – Mainly for reference or regional search engines, as they are not used by Google, Bing or Yahoo.
Meta Description – Missing, duplicate, long, short or multiple descriptions.
Page Titles – Missing, duplicate, long, short or multiple title elements.
Duplicate Pages – Discover exact and near duplicate pages using advanced algorithmic checks.
URI Issues – Non-ASCII characters, underscores, uppercase characters, parameters, or long URLs.
Security – Discover insecure pages, mixed content, insecure forms, missing security headers and more.
External Links – View all external links, their status codes and source pages.
Blocked Resources – View & audit blocked resources in rendering mode.
Blocked URLs – View & audit URLs disallowed by the robots.txt protocol.
Redirects – Permanent, temporary, JavaScript redirects & meta refreshes.
Errors – Client errors such as broken links & server errors (no responses, 4XX client & 5XX server errors).
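
To make the Blocked URLs check above more concrete, here is a minimal sketch in Python of how URLs can be tested against robots.txt rules. It uses only the standard library's urllib.robotparser and is not how the tool itself is implemented. The robots.txt content, the example.com URLs and the blocked_urls helper are all hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch);
# a real crawler would download it from the site being audited.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs from `urls` that `robots_txt` disallows for `user_agent`."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs to audit.
urls = [
    "https://example.com/",
    "https://example.com/private/report.html",
    "https://example.com/blog/post",
    "https://example.com/tmp/cache.html",
]

for url in blocked_urls(ROBOTS_TXT, urls):
    print("Blocked by robots.txt:", url)
```

Passing a different user_agent value, such as "Googlebot", would check the same URLs against any rules written for that crawler, which loosely mirrors the idea behind the User-Agent Switcher entry in the list.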