JavaScript Regex To Get Parts Of URL

I came across an extremely useful JavaScript Regex To Get Parts Of URL lately that I thought I would document as it took some searching to find. Basically I was trying to find the host of any number of combinations of urls. Meaning, if you have https://adamthings.com/ I wanted to then only get adamthings.com back.

The use case for this was I need to ensure the host domains matched. So if I got an input of say http://www.alienwarefxthemes.com/ I wanted to fail it because adamthings.com does not equal alienwarefxthemes.com.

Now I realize this regex may be a bit overkill for this situation but it was one of the few that seem to handle the majority of my cases successfully.

Now, the regex. I’ll be the first to admit I don’t understand everything that is going on as I didn’t write it and regex are not my thing.

var _host_from_url = function (url) {
var clean_url = jq.trim(url);
var match = clean_url.match(/^((http[s]?|ftp):\/\/)?\/?([^\/\.]+\.)*?([^\/\.]+\.[^:\/\s\.]{2,3}(\.[^:\/\s\.]‌​{2,3})?)(:\d+)?($|\/)([^#?\s]+)?(.*?)?(#[\w\-]+)?$/i);

return match[4];
};

Sorry for the wrapping, but I wanted it to all be on the screen without scrolling. So in my case, I was returning the 5th part of the array as it was the host name. Lets look at some outcomes.

Url To Test:
https://adamthings.com/post/2014/02/17/hello-world-angularjs/ has 10 groups:

  1. http://
  2. http
  3. www.
  4. adamthings.com
  5. /
  6. post/2014/02/17/hello-world-angularjs/

As you can see there is a lot more information you can gather from this regex. I recommend playing with it and seeing how it works for you. Here are some other examples. (Left out the blanks to conserve space but you can see the url used and what parts it found.

www.adamthings.com has 10 groups:
www.
adamthings.com

http://www.subdomain.adamthings.com has 10 groups:
http://
http
subdomain.
adamthings.com

https://adamthings.com has 10 groups:
https://
https
www.
adamthings.com

adamthings.com has 10 groups:
adamthings.com

https://adamthings.com has 10 groups:
http://
http
adamthings.com

https://adamthings.com has 10 groups:
https://
https
adamthings.com

One thought on “JavaScript Regex To Get Parts Of URL

  1. Alojamiento web

    I am looking for that i want to get all urls into array or txt from google search with using javascript.My code does the search part but i also need urls. Does anybody help me ? Thanks in advance

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

StackOverflow Profile