Editing large txt file using regex / notepad++

Thread replies: 9
Thread images: 2

Captain Jew 2017-09-10 03:54:08 Post No.

File: regex-example.png (35KB, 931x346px)

Ok, so I want to find/replace a multiline string that starts with a certain word and ends with a certain word; the number of lines in each string is unknown. I can't seem to figure this one out. Here's exactly what I want deleted:

<page>
<title>List of animated television series</title>
<ns>0</ns>
<id>2139</id>
<revision>
<id>796987852</id>
<parentid>792553539</parentid>
<timestamp>2017-08-24T08:20:43Z</timestamp>
<contributor>
<username>Michig</username>
<id>1779282</id>
</contributor>
<comment>/* Other lists */ removed lists include compilation series of theatrical shorts such as ''[[The Bugs Bunny Show]]'' since they often feature some new wrap-around anim
*[[List of American animated television series]]
*[[List of British animated television series]]
*[[List of Canadian animated television series]]
*[[List of French animated television series]]
*[[List of Italian animated television series]]
*[[List of Flash animated television series]]
*[[List of animated television programs with LGBT characters]]

==External links==
* [[:ja:?????????????|?????????????]] - Lists of Japanese animated television series on [[Japanese Wikipedia]]
* [http://www.toonopedia.com/ Don Markstein's Toonopedia] – Very large index page
* [http://www.bcdb.com/ The Big Cartoon Database]
* [http://80scartoons.net/toons/ 80sCartoons] – Nostalgia for those who grew up in the 1980s in [[Western world|the West]]
* [http://en.accessup.org/anime/e_anime_date.html Anime sorted by release date, JP Works DB]

{{Animation}}
{{Lists of television programs by genre}}

{{DEFAULTSORT:List Of Animated Television Series}}
[[Category:Animated television series| ]]
[[Category:Lists of television series by genre|Animated]]
[[Category:Lists of animated television series]]</text>
<sha1>ny8cql5skgp2wkochqx7hihj6l5zidh</sha1>
</revision>
</page>

The string starts with " <page>
<title>List of " and ends with "</page>"

Captain Jew 2017-09-10 03:56:17 Post No.62350081

Any help would be helpful.....derp

Anonymous 2017-09-10 04:12:42 Post No.62350331

What kind of regex? You want to delete the whole thing?

<page>.*</page>

The "." matches newline characters in SOME regex flavors, but not in others.
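To make the newline point concrete, here is a small sketch using Python's re module, which is just one flavor among many; the sample string is made up. By default "." stops at a line break, while the re.DOTALL flag (or an inline "(?s)") lets it cross lines. In Notepad++ the comparable switch should be the ". matches newline" checkbox in the Replace dialog when Regular expression mode is selected.

```python
import re

# Made-up sample standing in for one <page> block spread over several lines.
sample = "<page>\n<title>List of things</title>\n</page>"

# Default behaviour: "." does not match "\n", so the match cannot span lines.
default_match = re.search(r"<page>.*</page>", sample)

# With DOTALL, "." also matches "\n" and the whole block is found.
dotall_match = re.search(r"<page>.*</page>", sample, flags=re.DOTALL)

# The inline flag (?s) has the same effect without a flags argument.
inline_match = re.search(r"(?s)<page>.*</page>", sample)

print(default_match is None)
print(dotall_match.group(0) == sample)
print(inline_match.group(0) == sample)
```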

Anonymous 2017-09-10 04:48:36 Post No.62350808

1. get written material about regexes
2. read them
3. ???
4. profit!!

Anonymous 2017-09-10 04:59:14 Post No.62350971

>>62350051
Don't try to parse HTML with regex, it's a stupid idea.

The answer is
s/^<page>\n<title>List of(.|\n)*?<\/page>//gm

Anonymous 2017-09-10 05:04:40 Post No.62351046

Use an XML parser (validating email addresses with a regex like that is wrong too).
Generally, the problems with using regex this way are escaping and greedy ("hungry") matching.
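A rough illustration of the greedy-matching problem, again in Python's re flavor. The miniature dump below is invented, and it assumes the tags sit at the start of a line with no indentation (a real dump may be indented, in which case the pattern would need to allow leading whitespace). The greedy ".*" runs to the last "</page>" in the input and takes the page you wanted to keep with it; the lazy ".*?" stops at the first "</page>" after each matching title.

```python
import re

# Invented miniature dump: two "List of" pages around one page to keep.
dump = (
    "<page>\n<title>List of A</title>\n<id>1</id>\n</page>\n"
    "<page>\n<title>Keep me</title>\n<id>2</id>\n</page>\n"
    "<page>\n<title>List of B</title>\n<id>3</id>\n</page>\n"
)

flags = re.DOTALL | re.MULTILINE

# Greedy: ".*" extends to the LAST </page>, swallowing everything in between.
greedy = re.compile(r"^<page>\n<title>List of.*</page>\n?", flags)

# Lazy: ".*?" ends each match at the FIRST </page> that follows the title.
lazy = re.compile(r"^<page>\n<title>List of.*?</page>\n?", flags)

greedy_result = greedy.sub("", dump)
lazy_result = lazy.sub("", dump)

print(repr(greedy_result))
print(lazy_result)
```

Both versions load the whole text into memory, so on a very large file this approach may not be practical without splitting the input first.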

Captain Jew 2017-09-10 05:09:50 Post No.62351129

>>62351046

Can you give me some more info on this, like where to download such a program (an XML parser?) and what commands to use to delete all the "List of..." titles?

Anonymous 2017-09-10 05:26:49 Post No.62351360

File: this.png (84KB, 745x697px)

>>62350971

Captain Jew 2017-09-10 05:31:32 Post No.62351442

>>62351360

Ugghhh, ok, I'll have to research how to use XML parsers. thanks anyways, /thread
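For anyone landing here later, a minimal sketch of the XML-parser route, using only xml.etree.ElementTree from Python's standard library (so there is nothing extra to download beyond Python itself). Nobody in the thread posted this; the function name drop_list_pages, the helper local_name, and the "<mediawiki>" wrapper element are assumptions made for illustration. It streams the file one element at a time, copies every <page> whose title does not start with "List of", and skips the rest. Anything outside <page> elements is not copied, and a dump that declares an XML namespace would come out with prefixed tag names, so treat it as a starting point rather than a finished tool.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def drop_list_pages(source, target, prefix="List of"):
    """Copy <page> elements from source to target, skipping matching titles.

    source: path or binary file object; target: binary file object.
    Returns the number of pages dropped.
    """
    dropped = 0
    target.write(b"<mediawiki>\n")  # assumed wrapper element for the output
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        # Text of the first <title> child, or "" if the page has none.
        title = next(
            (child.text or "" for child in elem if local_name(child.tag) == "title"),
            "",
        )
        if title.startswith(prefix):
            dropped += 1
        else:
            target.write(ET.tostring(elem, encoding="utf-8"))
        elem.clear()  # release this page's subtree to limit memory use
    target.write(b"</mediawiki>\n")
    return dropped


# Tiny in-memory demo with invented pages; real use would pass file paths/handles.
demo_source = io.BytesIO(
    b"<mediawiki>"
    b"<page><title>List of A</title><id>1</id></page>"
    b"<page><title>Keep me</title><id>2</id></page>"
    b"</mediawiki>"
)
demo_target = io.BytesIO()
demo_dropped = drop_list_pages(demo_source, demo_target)
print(demo_dropped)
print(demo_target.getvalue().decode("utf-8"))
```

With real files the call would look something like drop_list_pages("dump.xml", open("filtered.xml", "wb")), where both file names are placeholders.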
