Monday, 10 September 2018

Regex for matching mandatory and optional characters in random order

I need a regular expression to find text between HTML elements using the Visual Studio search engine (which might be C#).

This works to some extent:

>\s*([\w])+\s*<

But it has to match all the following "asdf"s:

<element>asdf
  <element>asdf.</element>asdf
  <element />
asdf asdf
</element>
<element>
  asdf!
</element>

What it should NOT find is empty space between two tags. This example should match NOTHING:

<element>

  <element>  </element>
</element>

What I need in particular is a regex that matches:

  • Starts with >
  • Ends with <
  • between those, at least one word character (\w) is mandatory
  • a bunch of special characters (_ . ? , ! SPACE) are optional
  • between start/end and the content there can be an unpredictable amount of whitespace, including line breaks (multiline)
  • the order of the characters between start and end is absolutely random

I don't want matches that contain only special characters and no \w.
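To make these requirements concrete, here is a minimal sketch of one pattern that seems to satisfy them. It is written in Python only so it can be run and checked. The pattern itself uses nothing beyond character classes and quantifiers, which .NET regexes also support, but whether the Visual Studio search box treats line breaks the same way is an assumption that needs verifying. The helper name text_between_tags and the exact set of allowed punctuation are illustrative choices, not part of the original question.

```python
import re

# Assumed allowed punctuation: . ? , ! (the underscore is already part of \w).
# \s covers spaces and line breaks, so no multiline flag is needed.
# The first class excludes \w, so the mandatory \w is the first word character found.
TEXT_BETWEEN_TAGS = re.compile(r">([\s.?,!]*\w[\w\s.?,!]*)<")


def text_between_tags(markup):
    """Return the trimmed text runs found between '>' and '<'."""
    return [m.group(1).strip() for m in TEXT_BETWEEN_TAGS.finditer(markup)]


should_match = """<element>asdf
  <element>asdf.</element>asdf
  <element />
asdf asdf
</element>
<element>
  asdf!
</element>"""

should_not_match = """<element>

  <element>  </element>
</element>"""

print(text_between_tags(should_match))      # ['asdf', 'asdf.', 'asdf', 'asdf asdf', 'asdf!']
print(text_between_tags(should_not_match))  # []
```

Because neither character class contains < or >, a match can never run past the next tag, and a run made only of whitespace or punctuation is rejected since nothing in it can satisfy the single mandatory \w.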

Another attempt, which doesn't work at all, is this:

>\s*((?=[\w]+)(?=[ ?=()!"_]*))\s*<
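A likely reason this attempt never matches: lookaheads are zero-width, so the group consumes nothing. The first lookahead demands a word character at the current position, while the following \s*< demands whitespace or < at that same position, and no character is both. The sketch below checks this in Python and shows a hypothetical lookahead-based variant. The name with_lookahead and the allowed punctuation set are assumptions for illustration.

```python
import re

# The second attempt, copied from the question.
attempt = re.compile(r'>\s*((?=[\w]+)(?=[ ?=()!"_]*))\s*<')

# Hypothetical variant: assert that a word character occurs before the next tag,
# then actually consume the allowed characters up to '<'.
with_lookahead = re.compile(r">(?=[^<>]*\w)[\w\s.?,!]*<")

sample = "<element>\n  asdf!\n</element>"

print(attempt.search(sample))                       # None
print(repr(with_lookahead.search(sample).group()))  # the '>' ... '<' span around asdf!
print(with_lookahead.search("<a>  </a>"))           # None
```

The lookahead only checks that a \w exists somewhere before the next tag. The character class after it is what consumes the text, which is the part the original attempt was missing.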

What is the correct way to accomplish this?

Thank you so much!



