Html Agility Pack Get All Elements by Class

Html Agility Pack is a power tool for parsing through document source. I had a need where I needed to parse a document using html agility pack to get all elements by class name. It really is a simple function with Html Agility Pack but getting the syntax right was the difficult part for me.

Here is my use case:

I need to select all elements that have the class float on them. I started with this query which was working for just div tags.

var findclasses = _doc.DocumentNode.Descendants("div").Where(d => 
    d.Attributes.Contains("class") && d.Attributes["class"].Contains("float")

What this does is it takes your _doc and finds all divs within where they have an attribute names class and then goes one step farther to ensure that class contains float

Whew what a mouth full. Lets look at what an example node it would select.

<div class="className float anotherclassName">

So now, how do we get ALL ELEMENTS in the doc that contain the same class of float. If we take a look back at our HTML Agility Pack query there is one small change we can make to the .Descendants portion that will return all elements by class. This may seem simple, but took quite awhile to come to, if you simply leave .Descendants empty, it will return all elements. Look below:

var findclasses = _doc.DocumentNode.Descendants().Where(d => 
    d.Attributes.Contains("class") && d.Attributes["class"].Contains("float")

The query above will return ALL ELEMENTS that include a class with the name of float.

Documented based off my question on stackoverflow here: Html Agility Pack get all elements by class

StackOverflow Profile