Money is the liveliest matter in the world.
“No Sex or Fantasy Code Forbidden.” #1ZE
The Biggest HTML CD Example
in the NET.
An HTML Blu-ray disc is a medium holding (my) 20 GB of local backups: all the files and internet content I made over the last 20 years, with every kind of internet file extension. First, the tools: I use BBEdit or Notepad++, Compare Merge 2 (Mac), and at least HTTrack for Windows.
First, look at every level of your local backups and decide which content you want to save/crawl. Here we go: we start with the sitemap (which you also need).
To build it, run a compare between the backup folder and some other folder, and show only files ending in .htm*
(click the options for "on the right" and "similar"). After this step, which takes at least 5 minutes, copy the paths of all listed files
(for example 75,000 files) into BBEdit.
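If you do not have a compare tool at hand, the path list can also be produced with a small script. This is only a sketch under my own assumptions: the function name `list_html_files` is made up for illustration, and it simply walks one backup folder instead of comparing two.

```python
from pathlib import Path


def list_html_files(backup_root):
    """Return sorted relative paths of all .htm* files below backup_root.

    Paths are relative to backup_root and use "/" as separator,
    so the local folder prefix is already stripped.
    """
    root = Path(backup_root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        # ".htm*" in the text: .htm, .html and similar suffixes
        if p.is_file() and p.suffix.lower().startswith(".htm")
    )
```

Calling `list_html_files("/path/to/backup")` (a placeholder path) returns one entry per page, ready to paste into the editor.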
BBEdit copes better with big files in memory, so this is where you update the URLs:
First delete the local folder prefix, such as User/FOLDER/.., meaning everything at the start of each path that is not part of your website, and replace it with CODE: <a href="
. Now also replace .html with .html">STD</a>
.
Second, handle .htm: use the same replacement as above for .htm,
but add a \n (line break)
after it in both the search and the replacement, so that .htm is only replaced when it ends the line.
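The search-and-replace steps above can also be sketched as a script. This is an assumption-laden sketch, not the only way to do it: the function name `build_sitemap` is hypothetical, the input is a list of paths that are already relative to the website folder, and the link text "STD" is taken from the replacement above. It also wraps the links in the <!DOCTYPE html> … </html> structure that a sitemap file needs.

```python
import html
from urllib.parse import quote


def build_sitemap(relative_paths, link_text="STD"):
    """Turn relative page paths into a minimal HTML sitemap string."""
    lines = ["<!DOCTYPE html>", "<html>", "<body>"]
    for rel in relative_paths:
        # Percent-encode spaces and similar characters; "/" is kept as is.
        href = quote(rel)
        lines.append(
            f'<a href="{html.escape(href)}">{html.escape(link_text)}</a>'
        )
    lines += ["</body>", "</html>"]
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file such as sitemap.html (a placeholder name) in the website folder gives HTTrack one page that links to every .htm and .html file.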
Well done, now we have a sitemap. PS: The sitemap HTML file also needs a <!DOCTYPE html> … </html>
structure wrapped around the links. As mentioned, we use HTTrack. First thing to say: if you are crawling a local backup (as here), you should really disconnect from the
internet, because then you download only from the local backup, at lightning speed.
So be sure to go offline, because if you don't, you end up with 500 crawled files per hour.
The HTTrack settings: Before anything else, check your file structure with TrashMe or a folder-size tool, and look for subfolders that are much too big. For those: put the URL of each big subfolder on the first lines of the address list, and the main address after them.
The program: If you need the same settings again, open the saved project and go back to rename the project. Make a new project, for example BHTMLCD, and click Next. Then enter the subfolders with a high number of files, and after them the sitemap/index URL
(the one you saved at the beginning) for the sitemap crawl. By the way, you should make a sitemap for the other folders too. If you have already downloaded something, choose the action for an interrupted crawl; if not (an empty project), choose download.
Preferences: these are the Scan Rules: +*.htm, +*.html, +*.mp3, +*.wmv, +*.ico, +*.xml, +*.xsl, +*.txt, +*.js, +*.pdf, +*.rss, +*.ror, +*.gif, +*.jpg, +*.png, -*=*, -*?*
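To see what these scan rules do, here is a rough model of them in Python. It is a simplification and not HTTrack's real matcher: I assume that only `*` is a wildcard, that the last matching rule wins, and that a URL matching no rule is rejected (the `default` parameter). The names `SCAN_RULES` and `is_allowed` are made up for this sketch.

```python
import re

# The rule list from the text above, in the same order.
SCAN_RULES = [
    "+*.htm", "+*.html", "+*.mp3", "+*.wmv", "+*.ico", "+*.xml",
    "+*.xsl", "+*.txt", "+*.js", "+*.pdf", "+*.rss", "+*.ror",
    "+*.gif", "+*.jpg", "+*.png", "-*=*", "-*?*",
]


def _matches(pattern, url):
    # Treat "*" as "any characters"; every other character is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, url) is not None


def is_allowed(url, rules=SCAN_RULES, default=False):
    """Apply +/- rules in order; the last matching rule decides (assumption)."""
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if _matches(pattern, url):
            verdict = sign == "+"
    return verdict
```

In this model a plain page such as `site/page.html` is accepted, while `site/page.html?id=3` is rejected by the two minus rules at the end, which is the point of `-*=*` and `-*?*`: no query-string URLs on the disc.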
Now we change the mirroring depth (caution with mass downloads). Enter by hand in the first field: Maximum levels: 3, or better 888. External depth: 0. Then four fields left empty, and one more left empty. (All entered by hand.) Max transfer
rate: 2,500,000 bytes, and max connections: 1000. Then, importantly: 999,999,999 as the maximum number of links.
Flow control: use 8 spiders (connections), Timeout 3 and Retries 1, and nothing else. In Build choose: no error pages, no external links, hide passwords, and hide query strings. Done; now in the dropdown choose the "all files only in web" (gadget) entry, because this is an HTML CD. Under Spider, simply choose no
robots.txt rules. It is practical to enable a log file under Log (careful: it becomes a huge file). Those were all the changes.
PS: The URL should be a local file:///C:/ address because the source is local, but this is not mandatory. So click Next and then Finish, and wait out the crawl, a few hours for your 17.5 GB super thing. Service Support. Thanks,
the "Boss".
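On the local file:/// address from the PS above: if you are unsure how a Windows path looks as a URL, Python's pathlib can convert it. A minimal sketch; the folder names in the usage note are placeholders and the helper name `to_file_url` is my own.

```python
from pathlib import PureWindowsPath


def to_file_url(windows_path):
    """Convert an absolute Windows path into a file:/// URL."""
    # as_uri() also percent-encodes spaces and similar characters.
    return PureWindowsPath(windows_path).as_uri()
```

For example, `to_file_url(r"C:\Backup\site\sitemap.html")` gives `file:///C:/Backup/site/sitemap.html`, which is the form to paste into the address field.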
Notes: If iTunes on your Mac is not sorting the songs correctly (on a Mac with a 500 GB hard drive): simply buy an external M.2 enclosure and a small M.2 drive; make sure you then have 100 GB of free space, and your iTunes media collection is sorted and working again.
2: I am using an NVIDIA Shield device (4K) because my Yamaha receiver then works better with Dolby Surround+.
The story of a man as: 6*6+12=48 or Atomar Raster.
Ability: A (love) pairing concept. Open the Door.