What Is URL Extraction?
URL extraction is the process of scanning a block of text and pulling out all web addresses it contains. Whether the URLs include a full protocol like https:// or are bare domains like example.com, this tool finds and lists every one of them instantly.
This is useful for auditing links in documents, extracting references from articles, building link lists from web pages, and verifying URLs in code or configuration files. Everything runs entirely in your browser with no server processing.
How to Use This Tool
Enter Your Text
Type directly into the input editor, paste content with Ctrl+V, or upload/drag a .txt file containing text with URLs.
Toggle Unique Only
Enable the Unique only checkbox to remove duplicate URLs from the results. Disable it to see every occurrence.
Review Extracted URLs
Extracted URLs appear instantly in the output, one per line. The count and domain breakdown update in real time as you type.
Copy or Download
Use Copy to copy all extracted URLs to the clipboard, Download to save them as a .txt file, or Clear to reset.
Features Explained
Protocol & Bare Domain Detection
This tool extracts full URLs with any protocol scheme including http://, https://, ftp://, ws://, wss://, and compound schemes like git+ssh://. It also detects bare domains like example.com or docs.google.com/spreadsheets. Bare domain detection supports 60+ top-level domains including .com, .org, .net, .io, .dev, .ai, .co.uk, and many more.
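The two-pass idea described above can be sketched in a few lines. This is a simplified illustration, not the tool's actual pattern; the regexes and the small TLD list here are assumptions for the example:

```javascript
// Hypothetical extraction sketch: first match any scheme:// URL, then look for
// bare domains in the remaining text. Real-world patterns are more thorough.
const SCHEME_URL = /\b[a-z][a-z0-9+.-]*:\/\/[^\s<>"']+/gi;
const BARE_DOMAIN = /\b(?:www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|org|net|io|dev|ai|co\.uk)(?:\/[^\s<>"']*)?/gi;

function extractUrls(text) {
  const withScheme = text.match(SCHEME_URL) ?? [];
  // Blank out scheme URLs so bare-domain matching doesn't re-match their hosts.
  const rest = text.replace(SCHEME_URL, " ");
  const bare = rest.match(BARE_DOMAIN) ?? [];
  return [...withScheme, ...bare];
}
```

Running the two passes separately keeps full URLs intact while still catching casually written domains in the surrounding prose.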
Smart Punctuation Handling
URLs at the end of sentences often have trailing punctuation like periods, commas, or closing parentheses. The tool intelligently strips trailing punctuation while preserving balanced parentheses inside URLs, such as Wikipedia disambiguation links.
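One way to implement this cleanup is to peel characters off the end of a match, keeping a closing ")" only when the URL contains a matching "(". This is an assumed approach for illustration, not necessarily the tool's exact rules:

```javascript
// Strip sentence punctuation from the end of a URL candidate, but keep a
// closing ")" when it balances an "(" inside the URL (Wikipedia-style links).
function stripTrailing(url) {
  let out = url;
  while (out.length > 0) {
    const last = out[out.length - 1];
    if (".,;:!?".includes(last)) {
      out = out.slice(0, -1);
    } else if (last === ")") {
      const opens = (out.match(/\(/g) ?? []).length;
      const closes = (out.match(/\)/g) ?? []).length;
      if (closes > opens) out = out.slice(0, -1);
      else break; // balanced: the ")" belongs to the URL
    } else {
      break;
    }
  }
  return out;
}
```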
Duplicate Removal
The Unique only checkbox deduplicates extracted URLs so each address appears only once in the output. This is useful when processing documents where the same link appears in multiple places.
Domain Breakdown
When URLs are found, a statistics panel shows the count of URLs per domain. This gives you a quick overview of which sites are most referenced in your text.
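A per-domain tally like this can be built with the standard `URL` parser; bare domains need a temporary scheme prefix before `URL()` will accept them. This is an assumed implementation for illustration:

```javascript
// Count extracted URLs per hostname for the statistics panel.
function domainCounts(urls) {
  const counts = {};
  for (const raw of urls) {
    let host;
    try {
      // Bare domains like "example.com/b" get a scheme so URL() can parse them.
      host = new URL(raw.includes("://") ? raw : `https://${raw}`).hostname;
    } catch {
      continue; // skip anything URL() cannot parse
    }
    counts[host] = (counts[host] ?? 0) + 1;
  }
  return counts;
}
```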
Real-Time Extraction
URLs are extracted instantly as you type or paste text. The extraction is memoized for performance, so only changes to the input text or the Unique only toggle trigger recalculation.
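Memoization here means caching the last result and recomputing only when an input changes, analogous to React's `useMemo` (whether the tool uses React is an assumption; the sketch below is framework-free):

```javascript
// Wrap an extraction function so it only recomputes when the text or the
// unique toggle actually changes between calls.
function memoizeExtraction(extract) {
  let lastText, lastUnique, lastResult;
  return (text, unique) => {
    if (text !== lastText || unique !== lastUnique) {
      lastText = text;
      lastUnique = unique;
      lastResult = extract(text, unique);
    }
    return lastResult;
  };
}
```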
File Upload & Drag and Drop
Upload a .txt file using the Upload button or drag and drop a text file directly onto the input area. Files up to 5MB are supported.
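The upload check can be reduced to two conditions: a .txt extension and the 5MB limit mentioned above. The browser-only wiring (FileReader, drag-and-drop events) is omitted; the validation sketch below is an assumption about how such a check might look:

```javascript
// Validate an upload candidate before reading it: .txt extension, <= 5 MB.
const MAX_BYTES = 5 * 1024 * 1024;

function isAcceptableUpload(name, sizeBytes) {
  return name.toLowerCase().endsWith(".txt") && sizeBytes <= MAX_BYTES;
}
```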
Who Is This Tool For?
SEO Specialists
Audit internal and external links in web pages, blog posts, and content to ensure link integrity and optimize site structure.
Content Creators
Extract all references and sources from articles, research documents, and notes to build citation lists and resource pages.
Developers
Pull URLs from log files, configuration files, API responses, and code comments for testing, migration, or debugging.
Researchers
Collect all referenced links from academic papers, reports, and web pages for literature reviews and source verification.
QA & Testers
Extract URLs from test documents and specifications to verify that all links are valid and pointing to the correct destinations.
Project Managers
Gather all resource links from meeting notes, project documents, and email threads into a clean, organized list.
Supported URL Formats
| Format | Example |
|---|---|
| HTTPS with path | https://example.com/path/to/page |
| HTTP with path | http://www.example.com/page |
| FTP link | ftp://files.example.com/pub/data.zip |
| WebSocket | ws://socket.example.com/chat |
| Secure WebSocket | wss://secure.example.com/stream |
| Compound scheme | git+ssh://git@example.com/repo/project.git |
| With query string | https://example.com/search?q=test&lang=en |
| With fragment | https://example.com/docs#section |
| With port | https://example.com:8080/api |
| Subdomain | https://docs.google.com/spreadsheets |
| Country-code TLD | https://example.co.uk/page |
| IP address | http://192.168.1.1:3000/api |
| With parentheses | https://en.wikipedia.org/wiki/URL_(disambiguation) |
| Bare domain | google.com |
| Bare with www | www.github.com |
| Bare with path | stackoverflow.com/questions/12345 |
| Bare subdomain | docs.google.com/spreadsheets |
Bare domains are validated against 60+ known top-level domains. Any scheme:// URL is extracted (http, https, ftp, ws, wss, git+ssh, etc.). Schemes without :// like mailto:, tel:, and data: are not extracted.
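The "scheme with ://" rule above can be captured by a single anchored test, which is what separates extractable URLs from mailto:, tel:, and data: URIs (a sketch of the rule, not the tool's source):

```javascript
// A candidate counts as a full URL only if its scheme is followed by "://".
function hasExtractableScheme(candidate) {
  return /^[a-z][a-z0-9+.-]*:\/\//i.test(candidate);
}
```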
Tips for Extracting URLs
Paste entire web pages
Copy the full text of a web page (Ctrl+A, Ctrl+C) and paste it here. The tool will find all URLs buried in the content, navigation, and footer sections.
Process HTML source
Paste raw HTML source code to extract all href and src URLs. The tool will pull URLs from attributes, inline styles, and script references.
Use Unique only for clean lists
When auditing links or building resource lists, enable Unique only to automatically remove duplicates and get a clean set of distinct URLs.
Check the domain breakdown
The domain breakdown panel helps you quickly see which sites are most referenced. Useful for SEO audits, link analysis, and content reviews.
Handles trailing punctuation
URLs at the end of sentences are cleaned automatically. Trailing periods, commas, and unbalanced parentheses are stripped while preserving valid URL characters.
Bare domains detected
You don't need full https:// URLs. The tool also detects bare domains like google.com and www.example.com written casually in text.
Privacy & Security
This tool runs 100% in your browser. Your text and extracted URLs are never uploaded to any server. All extraction and filtering happens locally using JavaScript.
Your input is stored only in your browser's local storage so it persists when you refresh the page. You can clear it at any time using the “Clear” button. No cookies are used, no analytics track your text content, and no third-party services have access to what you type.