Loading a big file in Python

I tried implementing David Eyk's suggestion as in the snippet below, manually triggering garbage collection after every fixed number of lines. Comparing the previous memory-profiling output with the new one: str dropped by 45 objects, tuple dropped by 25 objects, and dict (no owner), though unchanged in object count, shrank in total size.
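That snippet did not survive the copy, but a minimal sketch of the idea looks like this (process_line and the 100,000-line interval are placeholders, not the original code):

```python
import gc

GC_INTERVAL = 100000  # hypothetical interval; tune for your workload

def process_line(line):
    """Stand-in for the real per-line processing."""
    return line.strip()

def load_big_file(path):
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            process_line(line)
            if lineno % GC_INTERVAL == 0:
                gc.collect()  # force a full pass to release unreachable objects
```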

All other objects are unchanged, including the dict of __main__. The total number of objects barely moved; the reduction is a fraction of a percent. Very strange. So although I know where the memory starvation is occurring, I am not yet able to narrow down which object is causing the bleed. Thanks for all the pointers; I'm using this opportunity to learn a lot about Python, as I'm no expert, and I appreciate the time you've taken to assist me. I continued debugging with the minimalistic code per martineau's suggestion.
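For reference, a heap summary like the one described above (object counts and total sizes per type, including a "dict (no owner)" row) can be produced with the third-party guppy/heapy package; this is only a sketch, assuming that is the profiler in use:

```python
from guppy import hpy  # third-party: guppy (Python 2) / guppy3 (Python 3)

hp = hpy()
hp.setrelheap()   # measure only allocations made after this point

# ... run the file-loading code here ...

print(hp.heap())  # prints a table of object counts and sizes by type
```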

I began adding code back into the process function and watching where the memory bled. Discussing the issue with my friend Gustavo Carneiro, he suggested using __slots__ to replace the per-instance dict (a sketch follows below). On the separate question of reading the file itself: iterating over a file object works because files are iterators that yield one line at a time until there are no more lines to yield. The fileinput module will also let you read it line by line without loading the entire file into memory. The simple solution deserves official mention, though HerrKaputt already pointed it out in a comment.
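A minimal sketch of the __slots__ idea, assuming the per-line records were previously stored as ordinary objects (the Record class and its fields are hypothetical):

```python
class Record(object):
    # Declaring __slots__ removes the per-instance __dict__, so each
    # object stores only these fixed attributes; with millions of live
    # instances this can cut memory use substantially.
    __slots__ = ("key", "value", "timestamp")

    def __init__(self, key, value, timestamp):
        self.key = key
        self.value = value
        self.timestamp = timestamp
```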

This is simple, Pythonic, and understandable; even if you don't understand how the with statement works, just use it (see the sketch below). As the other poster mentioned, it does not build a large in-memory list the way file.readlines() does. And if the file is JSON, XML, CSV, genomics data or any other well-known format, there are specialized readers that use C code directly and are far more optimized for both speed and memory than parsing in pure Python, so avoid hand-rolling a parser whenever possible.
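For completeness, the idiom being discussed, plus one example of handing the parsing to a specialized reader; the file names and the do_something helper are placeholders:

```python
import csv

def do_something(item):
    """Stand-in for whatever per-record work the program does."""
    pass

# Iterate over a plain text file one line at a time; only the current
# line is held in memory.
with open("big_log.txt") as f:
    for line in f:
        do_something(line)

# For structured formats, let a dedicated reader do the parsing.
with open("big_table.csv", newline="") as f:
    for row in csv.reader(f):
        do_something(row)
```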

Similar, if not the same, question: stackoverflow. The issue is that if the JSON file is, for example, one giant list, then parsing it into Python wouldn't make much sense without doing it all at once. I guess your best bet is to find a module that handles JSON like SAX and gives you events for starting arrays and objects, rather than handing you fully built objects (a sketch of that approach follows).
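The third-party ijson package is one such event/stream-style JSON parser; a minimal sketch, assuming the top-level document is a large array of objects and with a hypothetical handle function:

```python
import ijson  # third-party package: pip install ijson

def handle(obj):
    """Stand-in for whatever is done with each parsed element."""
    pass

with open("huge.json", "rb") as f:
    # ijson.items() streams each element of the top-level array ("item")
    # without materializing the whole document in memory.
    for obj in ijson.items(f, "item"):
        handle(obj)
```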

Unfortunately, that doesn't exist in the standard library. Well, I kind of want to read it in all at once. One of my potential plans is to go through it once and stick everything in a database so I can access it more efficiently. If you can't fit the entire file as text into memory, I sincerely doubt you'll fit the entire file as Python objects into memory.

If you want to put it in a database, my answer could be helpful (a sketch of a one-pass load into SQLite follows below). For any non-trivial task, processing JSON files of such sizes can easily take weeks or months.
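A minimal sketch of the "go through it once and put everything in a database" plan, using the standard-library sqlite3 module together with a streaming JSON parse; the table layout, field names and file names are hypothetical:

```python
import sqlite3
import ijson  # third-party; any streaming parser would do

conn = sqlite3.connect("records.db")
conn.execute("CREATE TABLE IF NOT EXISTS records (key TEXT, value TEXT)")

with open("huge.json", "rb") as f:
    rows = ((obj.get("key"), obj.get("value"))
            for obj in ijson.items(f, "item"))
    # executemany() consumes the generator lazily, so the whole file is
    # never held in memory; a single commit at the end keeps it fast.
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)

conn.commit()
conn.close()
```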


