Editing large txt file using regex / notepad++

Thread replies: 9
Thread images: 2

File: regex-example.png (35KB, 931x346px)
Ok, so I want to find/replace a multiline string that starts with a certain word and ends with a certain word; the number of lines in each string is unknown. I can't seem to figure this one out. Here's exactly what I want deleted:

<page>
<title>List of animated television series</title>
<ns>0</ns>
<id>2139</id>
<revision>
<id>796987852</id>
<parentid>792553539</parentid>
<timestamp>2017-08-24T08:20:43Z</timestamp>
<contributor>
<username>Michig</username>
<id>1779282</id>
</contributor>
<comment>/* Other lists */ removed lists include compilation series of theatrical shorts such as ''[[The Bugs Bunny Show]]'' since they often feature some new wrap-around anim
*[[List of American animated television series]]
*[[List of British animated television series]]
*[[List of Canadian animated television series]]
*[[List of French animated television series]]
*[[List of Italian animated television series]]
*[[List of Flash animated television series]]
*[[List of animated television programs with LGBT characters]]

==External links==
* [[:ja:?????????????|?????????????]] - Lists of Japanese animated television series on [[Japanese Wikipedia]]
* [http://www.toonopedia.com/ Don Markstein's Toonopedia] – Very large index page
* [http://www.bcdb.com/ The Big Cartoon Database]
* [http://80scartoons.net/toons/ 80sCartoons] – Nostalgia for those who grew up in the 1980s in [[Western world|the West]]
* [http://en.accessup.org/anime/e_anime_date.html Anime sorted by release date, JP Works DB]

{{Animation}}
{{Lists of television programs by genre}}

{{DEFAULTSORT:List Of Animated Television Series}}
[[Category:Animated television series| ]]
[[Category:Lists of television series by genre|Animated]]
[[Category:Lists of animated television series]]</text>
<sha1>ny8cql5skgp2wkochqx7hihj6l5zidh</sha1>
</revision>
</page>


The string starts with " <page>
<title>List of " and ends with "</page>"
>>
Any help would be helpful.....derp
>>
What kind of regex? You want to delete the whole thing?

<page>.*</page>

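Whether . matches newline characters depends on the regex flavor and its options. In Notepad++ it only does when the ". matches newline" box is ticked in the Replace dialog. Also note that .* is greedy, so with that option on it runs from the first <page> all the way to the last </page>; use .*? to stop at the nearest </page>.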
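
To make the newline and greedy points concrete, here is a minimal sketch in Python's re module (not Notepad++; the two-page sample text is made up for illustration). It compares the plain pattern, the same pattern with dot-matches-newline turned on, and a lazy variant:

```python
import re

# Hypothetical two-block sample, standing in for the real dump.
text = "<page>\nA\n</page>\n<page>\nB\n</page>\n"

# Without DOTALL, '.' stops at newlines, so a multi-line block never matches.
no_flag = re.findall(r"<page>.*</page>", text)

# With DOTALL, greedy '.*' runs from the first <page> to the last </page>.
greedy = re.findall(r"<page>.*</page>", text, flags=re.DOTALL)

# Lazy '.*?' stops at the nearest </page>, giving one match per block.
lazy = re.findall(r"<page>.*?</page>", text, flags=re.DOTALL)

print(len(no_flag), len(greedy), len(lazy))  # 0 1 2
```

Other flavors spell the option differently, but the same three behaviors show up in most of them.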
>>
1. get written material about regexes
2. read them
3. ???
4. profit!!
>>
>>62350051
Don't try to parse HTML with regex, it's a stupid idea.

The answer is
s/^<page>\n<title>List of(.|\n)*?<\/page>//gm
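
For anyone who would rather script it than use an editor, a rough Python equivalent of that substitution might look like the sketch below. The three-page sample string is invented, and it assumes every <page> and <title> tag sits at the start of its own line, as in the OP's paste. The match is lazy, so each hit ends at the nearest </page> instead of swallowing the pages in between.

```python
import re

# Made-up miniature dump: two "List of" pages around one page to keep.
dump = (
    "<page>\n<title>List of animated television series</title>\n<ns>0</ns>\n</page>\n"
    "<page>\n<title>Anime</title>\n<ns>0</ns>\n</page>\n"
    "<page>\n<title>List of British animated television series</title>\n</page>\n"
)

# ^ anchors at line starts (MULTILINE); (?:.|\n)*? is the lazy form of the
# thread's (.|\n)*; the trailing \n? also removes the now-empty line.
pattern = re.compile(r"^<page>\n<title>List of(?:.|\n)*?</page>\n?", re.MULTILINE)

cleaned = pattern.sub("", dump)
print(cleaned)
```

Only the "Anime" page should be left. On a very large file this kind of pattern can be slow, which is one more reason people recommend a real parser.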
>>
Use an XML parser (validating email addresses with a regex like this is wrong too).
The general problem with using regex this way is escaping and greedy matches.
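
A minimal sketch of the parser route, using Python's standard xml.etree.ElementTree. Assumptions, none of them confirmed by the thread: the pages sit directly under a single root element (called <mediawiki> here), the elements carry no XML namespace, and the file fits in memory. The helper name drop_list_pages is made up for this example.

```python
import xml.etree.ElementTree as ET

# Invented sample; a real dump's root may declare a namespace, in which
# case the tag names used below would need that namespace prefix.
SAMPLE = """<mediawiki>
<page><title>List of animated television series</title><ns>0</ns></page>
<page><title>Anime</title><ns>0</ns></page>
<page><title>List of French animated television series</title><ns>0</ns></page>
</mediawiki>"""

def drop_list_pages(root, prefix="List of "):
    """Remove every <page> child whose <title> starts with prefix."""
    # Iterate over a copy so removing children does not skip elements.
    for page in list(root.findall("page")):
        title = page.findtext("title", default="")
        if title.startswith(prefix):
            root.remove(page)
    return root

root = drop_list_pages(ET.fromstring(SAMPLE))
titles = [p.findtext("title") for p in root.findall("page")]
print(titles)  # ['Anime']
```

For a file on disk, ET.parse(path) gives a tree whose getroot() can be passed to the same helper, and tree.write(out_path, encoding="utf-8") saves the result. For dumps too big to load at once, ET.iterparse is the streaming alternative to look into.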
>>
>>62351046

Can you give me some more info on this, like where to download such a program (XML parser?) and what commands to use to delete all the "List of...." pages?
>>
File: this.png (84KB, 745x697px)
>>62350971
>>
>>62351360


Ugghhh, ok, I'll have to research how to use XML parsers. thanks anyways, /thread
This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.
If you need information for a Poster - contact them.