Extract URLs from Text

Extract all URLs from any text.

What Is URL Extraction?

URL extraction is the process of scanning a block of text and pulling out all web addresses it contains. Whether the URLs include a full protocol like https:// or are bare domains like example.com, this tool finds and lists every one of them instantly.

This is useful for auditing links in documents, extracting references from articles, building link lists from web pages, and verifying URLs in code or configuration files. Everything runs in your browser with no server processing.

How to Use This Tool

1. Enter Your Text

Type directly into the input editor, paste content with Ctrl+V, or upload or drag in a .txt file containing text with URLs.

2. Toggle Unique Only

Enable the Unique only checkbox to remove duplicate URLs from the results. Disable it to see every occurrence.

3. Review Extracted URLs

Extracted URLs appear instantly in the output, one per line. The count and domain breakdown update in real time as you type.

4. Copy or Download

Use Copy to copy all extracted URLs to the clipboard, Download to save them as a .txt file, or Clear to reset.

Features Explained

Protocol & Bare Domain Detection

This tool extracts full URLs with any protocol scheme, including http://, https://, ftp://, ws://, wss://, and compound schemes like git+ssh://. It also detects bare domains like example.com or docs.google.com/spreadsheets. Bare domain detection recognizes 60+ domain suffixes, including top-level domains such as .com, .org, .net, .io, .dev, and .ai, and multi-part suffixes such as .co.uk.
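As a rough illustration of the two detection modes described above, here is a minimal TypeScript sketch. It is not the tool's actual implementation: the names KNOWN_SUFFIXES, URL_CANDIDATE, and findUrlCandidates are hypothetical, the suffix list is a short stand-in for the real 60+ entries, and the regular expressions are one plausible way to separate scheme:// URLs from bare domains.

```typescript
// Hypothetical allowlist for illustration; the tool's real list of 60+ suffixes is not shown here.
const KNOWN_SUFFIXES: string[] = ["com", "org", "net", "io", "dev", "ai", "co.uk"];

// Matches a leading "scheme://": a letter, then letters, digits, "+", "." or "-".
const SCHEME_PREFIX = /^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//;

// First alternative: any scheme:// URL up to whitespace, quotes, or angle brackets.
// Second alternative: a dotted host name with an optional path (a bare-domain candidate).
const URL_CANDIDATE =
  /[a-zA-Z][a-zA-Z0-9+.-]*:\/\/[^\s<>"']+|(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(?:\/[^\s<>"']*)?/g;

function hasKnownSuffix(host: string): boolean {
  const lower = host.toLowerCase();
  return KNOWN_SUFFIXES.some((suffix) => lower.endsWith("." + suffix));
}

function findUrlCandidates(text: string): string[] {
  const matches: string[] = text.match(URL_CANDIDATE) || [];
  return matches.filter((candidate) => {
    if (SCHEME_PREFIX.test(candidate)) {
      return true; // scheme:// URLs are kept regardless of suffix
    }
    // Bare candidates are kept only when the host ends in an allowlisted suffix.
    const slash = candidate.indexOf("/");
    const host = slash === -1 ? candidate : candidate.slice(0, slash);
    return hasKnownSuffix(host);
  });
}

console.log(findUrlCandidates("Docs: docs.google.com/spreadsheets and notes.txt"));
```

The allowlist is what keeps file names such as notes.txt out of the results. This sketch does not handle trailing punctuation, email addresses, or bare IP addresses; those would need extra rules.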

Smart Punctuation Handling

URLs at the end of sentences often pick up trailing punctuation such as periods, commas, or closing parentheses. The tool strips that trailing punctuation while preserving balanced parentheses inside URLs, as found in many Wikipedia links.
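One way to get this behavior is to peel characters off the end of a match until the last character is neither sentence punctuation nor an unmatched closing parenthesis. The sketch below assumes that approach; trimTrailingPunctuation and countChar are hypothetical names, and the exact set of stripped characters is a guess rather than the tool's documented rule.

```typescript
// Counts how many times a single character occurs in the text.
function countChar(text: string, ch: string): number {
  return text.split(ch).length - 1;
}

function trimTrailingPunctuation(url: string): string {
  // Assumed set of sentence punctuation; the tool's real set may differ.
  const sentencePunctuation = ".,;:!?";
  let result = url;
  while (result.length > 0) {
    const last = result.charAt(result.length - 1);
    if (sentencePunctuation.indexOf(last) !== -1) {
      result = result.slice(0, -1);
    } else if (last === ")" && countChar(result, ")") > countChar(result, "(")) {
      // More closers than openers: this ")" belongs to the sentence, not the URL.
      result = result.slice(0, -1);
    } else {
      break;
    }
  }
  return result;
}

console.log(trimTrailingPunctuation("https://en.wikipedia.org/wiki/URL_(disambiguation))."));
```

Comparing the counts of "(" and ")" is what lets a Wikipedia-style link keep its closing parenthesis while a link wrapped in a parenthetical remark loses the extra one.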

Duplicate Removal

The Unique only checkbox deduplicates extracted URLs so each address appears only once in the output. This is useful when processing documents where the same link appears in multiple places.
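A minimal sketch of order-preserving deduplication follows. The function name applyUniqueOnly is hypothetical, and the sketch assumes duplicates are detected by exact string comparison, so https://Example.com and https://example.com would count as different entries; the tool may normalize URLs differently.

```typescript
function applyUniqueOnly(urls: string[], uniqueOnly: boolean): string[] {
  if (!uniqueOnly) {
    return urls.slice(); // keep every occurrence
  }
  // A Set iterates in insertion order, so each URL keeps the position of its first occurrence.
  return Array.from(new Set(urls));
}

console.log(applyUniqueOnly(["https://a.example.com", "b.example.org", "https://a.example.com"], true));
```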

Domain Breakdown

When URLs are found, a statistics panel shows the count of URLs per domain. This gives you a quick overview of which sites are most referenced in your text.
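The per-domain count can be sketched as a small tally over the extracted list. The helper names hostOf and countByDomain are hypothetical, and the host parsing here is a simplified assumption: it strips the scheme, any user@ prefix, and any port or path, and it does not handle IPv6 literals. Sorting by descending count is also an assumption about how the panel orders its rows.

```typescript
function hostOf(url: string): string {
  // Drop a leading "scheme://" if present.
  const withoutScheme = url.replace(/^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//, "");
  // Keep everything before the first "/", "?" or "#".
  const end = withoutScheme.search(/[\/?#]/);
  const authority = end === -1 ? withoutScheme : withoutScheme.slice(0, end);
  // Drop "user@" and anything after the first ":" (a port, for example).
  const afterUserInfo = authority.slice(authority.lastIndexOf("@") + 1);
  return afterUserInfo.replace(/:.*$/, "").toLowerCase();
}

function countByDomain(urls: string[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  urls.forEach((url) => {
    const host = hostOf(url);
    counts.set(host, (counts.get(host) || 0) + 1);
  });
  // Most referenced domains first.
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}

console.log(countByDomain(["https://example.com/a", "example.com/b", "http://docs.google.com/x"]));
```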

Real-Time Extraction

URLs are extracted instantly as you type or paste text. The extraction is memoized for performance, so recalculation happens only when the input text or the Unique only toggle changes.
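The source does not say how the memoization is implemented (a UI framework hook would be a common choice). As a framework-free sketch of the same idea, the wrapper below caches the most recent result and reruns the extractor only when the text or the toggle differs from the previous call. The names Extractor and memoizeLast are hypothetical.

```typescript
type Extractor = (text: string, uniqueOnly: boolean) => string[];

// Wraps an extractor so that repeated calls with the same arguments reuse the last result.
function memoizeLast(extract: Extractor): Extractor {
  let hasCache = false;
  let lastText = "";
  let lastUnique = false;
  let lastResult: string[] = [];
  return (text, uniqueOnly) => {
    if (!hasCache || text !== lastText || uniqueOnly !== lastUnique) {
      lastResult = extract(text, uniqueOnly);
      lastText = text;
      lastUnique = uniqueOnly;
      hasCache = true;
    }
    return lastResult;
  };
}

// Stand-in extractor for demonstration only: splits on whitespace.
const demoExtract = memoizeLast((text) => text.split(/\s+/).filter((part) => part.length > 0));
console.log(demoExtract("example.com example.org", false));
```

A single-entry cache fits this use because a live editor only ever needs the result for its current input.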

File Upload & Drag and Drop

Upload a .txt file using the Upload button or drag and drop a text file directly onto the input area. Files up to 5 MB are supported.

Who Is This Tool For?

SEO Specialists

Audit internal and external links in web pages, blog posts, and content to ensure link integrity and optimize site structure.

Content Creators

Extract all references and sources from articles, research documents, and notes to build citation lists and resource pages.

Developers

Pull URLs from log files, configuration files, API responses, and code comments for testing, migration, or debugging.

Researchers

Collect all referenced links from academic papers, reports, and web pages for literature reviews and source verification.

QA & Testers

Extract URLs from test documents and specifications to verify that all links are valid and pointing to the correct destinations.

Project Managers

Gather all resource links from meeting notes, project documents, and email threads into a clean, organized list.

Supported URL Formats

Format               Example
HTTPS with path      https://example.com/path/to/page
HTTP with path       http://www.example.com/page
FTP link             ftp://files.example.com/pub/data.zip
WebSocket            ws://socket.example.com/chat
Secure WebSocket     wss://secure.example.com/stream
Compound scheme      git+ssh://git@example.com:repo/project.git
With query string    https://example.com/search?q=test&lang=en
With fragment        https://example.com/docs#section
With port            https://example.com:8080/api
Subdomain            https://docs.google.com/spreadsheets
Country-code TLD     https://example.co.uk/page
IP address           http://192.168.1.1:3000/api
With parentheses     https://en.wikipedia.org/wiki/URL_(disambiguation)
Bare domain          google.com
Bare with www        www.github.com
Bare with path       stackoverflow.com/questions/12345
Bare subdomain       docs.google.com/spreadsheets

Bare domains are validated against a list of 60+ known domain suffixes. Any scheme:// URL is extracted (http, https, ftp, ws, wss, git+ssh, etc.). Schemes without ://, such as mailto:, tel:, and data:, are not extracted.

Tips for Extracting URLs

Paste entire web pages

Copy the full text of a web page (Ctrl+A, Ctrl+C) and paste it here. The tool will find all URLs buried in the content, navigation, and footer sections.

Process HTML source

Paste raw HTML source code to extract all href and src URLs. The tool will pull URLs from attributes, inline styles, and script references.

Use Unique only for clean lists

When auditing links or building resource lists, enable Unique only to automatically remove duplicates and get a clean set of distinct URLs.

Check the domain breakdown

The domain breakdown panel helps you quickly see which sites are most referenced. This is useful for SEO audits, link analysis, and content reviews.

Handles trailing punctuation

URLs at the end of sentences are cleaned automatically. Trailing periods, commas, and unbalanced parentheses are stripped while preserving valid URL characters.

Bare domains detected

You don't need full https:// URLs. The tool also detects bare domains like google.com and www.example.com written casually in text.

Privacy & Security

This tool runs 100% in your browser. Your text and extracted URLs are never uploaded to any server. All extraction and filtering happens locally using JavaScript.

Your input is stored only in your browser's local storage so it persists when you refresh the page. You can clear it at any time using the “Clear” button. No cookies are used, no analytics track your text content, and no third-party services have access to what you type.