Hey guys,
I’m currently trying to filter images from a list of websites. However, I’m stuck with the RegExp-Node, because my regular expression just won’t match the way I expect.
This
<img .*? src=["'](http://.*?[\.jpg]+)["'>\s]+
should get me all the *.jpg URLs from the source code, which partly works, but about 20% of the filtered strings look like this afterwards:
http://bla.com/bla/somepicture.gif" alt="bla"></a></tag><whatever>
I just can’t find the reason why 1) some ".gif" URLs get through and 2) the match isn’t cut off at the first ">", quote character, or space. I’ve tried everything from ^ and $ anchors to ".*?", but I just can’t get it right.
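To make this reproducible outside the node, here is a minimal sketch in Python. It assumes the pattern above is what was intended and that the node's regex engine behaves like Python's `re` for these constructs, which may not hold. The sample tags, the helper `first_capture`, and the `STRICTER` pattern are made up for illustration. One possible reading is that `[\.jpg]` is a character class (any one of `.`, `j`, `p`, `g`) rather than the literal text ".jpg", so the lazy `.*?` keeps growing until it finds one of those characters directly before a quote, ">" or whitespace.

```python
import re

# Pattern as written in the post (assumed reconstruction).
ORIGINAL = r"""<img .*? src=["'](http://.*?[\.jpg]+)["'>\s]+"""

# Hypothetical variant for comparison: literal ".jpg", and the URL part
# may not contain quotes, ">" or whitespace.
STRICTER = r"""<img\b[^>]*?\ssrc=["'](http://[^"'>\s]+?\.jpg)["']"""

# Made-up sample tags modelled on the output shown above.
GIF_TAG = '<img class="x" src="http://bla.com/bla/somepicture.gif" alt="bla"></a></tag><whatever>'
JPG_TAG = '<img class="x" src="http://bla.com/a/photo.jpg" alt="y">'


def first_capture(pattern, html):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, html)
    return match.group(1) if match else None


# The original pattern runs past the .gif until the "g" of </tag> precedes ">".
print(first_capture(ORIGINAL, GIF_TAG))
# The .jpg case happens to work because ".jpg" is directly followed by a quote.
print(first_capture(ORIGINAL, JPG_TAG))
# The stricter variant skips the .gif and keeps the .jpg.
print(first_capture(STRICTER, GIF_TAG))
print(first_capture(STRICTER, JPG_TAG))
```

With these sample inputs the original pattern captures everything from the .gif URL up to and including "</tag", which looks like the symptom described above, while the stricter variant returns nothing for the .gif tag.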
Anyway, does anyone have a hint for me? Thanks in advance! :)