PMXBOT Log file Viewer


#mongodb logs for Sunday the 19th of October, 2014

[00:01:01] <JamesHarrison> So I have a MongoDB server on SSDs dealing with a pretty small dataset - maybe 50 gigs in all - and one collection is getting _really_ slow on primary key (_id) lookups
[00:01:28] <JamesHarrison> that's only 500 MB of data (with about 400 MB of indexes), only 110,000 items
[00:01:57] <JamesHarrison> absolutely no idea why it's going so slowly compared to all the other collections - any ideas for how I'd go about debugging this/figuring it out?
[00:03:58] <JamesHarrison> (slow in this context being about 100ms per query)
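A common first step for a question like this is to run the slow query with explain() in the mongo shell and compare which access path is used and how much is scanned against a fast collection. A minimal sketch, assuming a 2.6-era shell; the collection name and the ObjectId are placeholders:

```javascript
// Check how a single _id lookup is actually executed.
// "items" stands in for the slow collection.
var result = db.items.find({ _id: ObjectId("5443a8cfe4b0f6d1a2b3c4d5") }).explain();

// Fields worth comparing against a fast collection:
//   cursor   - which index / access method was chosen
//   nscanned, nscannedObjects - how much work the lookup did
//   millis   - total query time on the server
printjson(result);
```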
[01:16:16] <GnuBoi> How do I insert an item into an array if it doesn't exist, but remove it if it does?
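One way to do this toggle, sketched in mongo shell syntax with hypothetical collection and field names, and assuming a 2.6-era shell where update() returns a WriteResult: try to $pull the value from a document that already contains it, and $addToSet it only if nothing matched.

```javascript
// Toggle "tag" in doc.tags: remove it if present, add it otherwise.
function toggleTag(id, tag) {
  // Matches only if the tag is already in the array, and removes it.
  var res = db.things.update(
    { _id: id, tags: tag },
    { $pull: { tags: tag } }
  );
  // Nothing matched, so the tag was absent: add it.
  if (res.nMatched === 0) {
    db.things.update({ _id: id }, { $addToSet: { tags: tag } });
  }
}
```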
[03:05:42] <jlntlan1> hi
[03:06:04] <jlntlan1> any1 here?
[03:06:48] <jlntlan1> my mongod just crashed and when i tried to do a dump from file system, i kept on getting "if you are running mongod on the same path you should connect to that instead"
[03:14:00] <jlntlan1> any1 here?
[12:20:22] <Forest> Hello, I am trying to insert OpenStreetMap data into MongoDB. I am able to store the Saarland area in a reasonable time (the source file is a 25 MB PBF), but whenever I try a bigger file it isn't completed even in 15 minutes. Can anyone help me find out what the problem might be?
[12:21:23] <Forest> By reasonable time I mean 2 minutes for Saarland.
[13:28:25] <Austin___> hi all, is it possible to create a whitelist for externally connecting IPs?
[13:31:02] <t04D> Austin___, with a firewall ?
[13:33:47] <Austin___> t04D: You'll have to excuse my unfamiliarity with networking, but would a firewall be able to whitelist IPs connecting to mongodb?
[13:34:00] <Austin___> specifically to mongodb, not other services such as SSH
[13:34:22] <t04D> well yeah sure
[13:34:37] <t04D> you could blacklist all connections to the publicly exposed port
[13:34:51] <Austin___> ah, gotcha
[13:34:55] <t04D> then add a whitelist by allowing specific IP addresses to go through this port
[13:35:19] <Austin___> and i configure all this in iptables?
[13:35:21] <t04D> you could even use something sexy like ipset to allow having a named group for this purpose 'mongo-whitelist'
[13:35:32] <t04D> iptables + ipset for the group lets say
[13:36:00] <Austin___> ah, ok, thanks very much
[13:36:04] <t04D> np
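A sketch of what t04D describes, assuming mongod listens on the default port 27017; the IP is a placeholder and these rules are illustrative rather than a complete firewall policy:

```sh
# Named set of client IPs allowed to reach MongoDB
ipset create mongo-whitelist hash:ip
ipset add mongo-whitelist 203.0.113.10

# Accept whitelisted sources on the mongod port, drop everyone else
iptables -A INPUT -p tcp --dport 27017 -m set --match-set mongo-whitelist src -j ACCEPT
iptables -A INPUT -p tcp --dport 27017 -j DROP
```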
[13:41:21] <Forest> t04D: can you help me speed up my insert, please? I have been struggling for hours. By my calculation a 120 MB PBF file should take 25 minutes, since 25 MB takes 2.5 minutes. I have tried both insert with an array parameter and a batch insert.
[13:43:11] <Forest> t04D: Inserting 2 million documents took 150 seconds.
[13:44:08] <Forest> this is my code: https://dpaste.de/gLQU Can anyone help, please?
[14:52:41] <diogotozzi> 1
[16:54:07] <Forest> Hello, is it normal that inserting 20 million JSON objects takes more than 25 minutes? I have waited that long and it still didn't get inserted. It worked with 2 million records, which took 2 and a half minutes. Can anyone please help me resolve these performance issues?
[16:55:02] <dimon222> hoh, 2 mln records for half minute
[16:55:13] <dimon222> that already feels fast enough to me
[16:55:43] <dimon222> prolly worth checking if you have enough RAM to store those
[16:57:15] <Forest> dimon222: when I insert the smaller file it only consumes around 1 GB of memory
[16:58:44] <Forest> dimon222: can you check my code to see if I am doing something wrong? I have tried both unordered bulk insert and, commented out, a normal insert.
[16:59:24] <dimon222> seems like it can be somehow related to indexes
[16:59:25] <dimon222> http://stackoverflow.com/questions/6783212/how-to-load-100-million-records-into-mongodb-with-scala-for-performance-testing
[16:59:48] <dimon222> >100 mln loaded in 1 hour 20 minutes // somewhere in comments
[16:59:56] <Forest> dimon222: i havent specified any indexes
[17:00:43] <dimon222> ok, I can try to look at your code, but I'm not sure it's actually a code-related problem, unless you're doing a lot of processing
[17:01:50] <Forest> dimon222: the commented-out code solved it by chunking the array into 16 MB batches and then inserting. https://dpaste.de/JxZJ
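The paste has since expired. A minimal sketch of the chunked, unordered bulk insert approach with the 2014-era Node.js driver, where the collection handle, the parsed docs array, and the batch size of 1000 are all assumptions:

```javascript
// Insert a large array of documents in unordered bulk batches,
// keeping each batch well under the 16 MB message limit.
function bulkInsert(collection, docs, callback) {
  var batchSize = 1000;
  var i = 0;

  function insertNextBatch() {
    if (i >= docs.length) return callback(null);

    var bulk = collection.initializeUnorderedBulkOp();
    docs.slice(i, i + batchSize).forEach(function (doc) {
      bulk.insert(doc);
    });
    i += batchSize;

    bulk.execute(function (err) {
      if (err) return callback(err);
      insertNextBatch();
    });
  }

  insertNextBatch();
}
```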
[17:04:23] <dimon222> i think it's actually a mongo problem, the code seems to be okay
[17:05:38] <Forest> dimon222: I need to store OpenStreetMap data in mongo and I can't even do it for my country :( I want to create a routing application that takes special user criteria, for my bachelor thesis
[17:06:00] <dimon222> hoh, thats a lot of data
[17:06:29] <Forest> dimon222: yes, compressed it's 120 MB in PBF format, uncompressed 4.5 GB
[17:06:56] <Forest> *3.5 GB uncompressed
[17:07:47] <Forest> my problem is i need to store those damn tags so i can distinguish between places
[17:08:56] <Forest> dimon222: and now I am helpless, I have tried almost everything to solve this and I just can't figure it out
[17:11:35] <dimon222> try to wait one hour or so, and check if it actually finishes
[17:14:13] <Forest> dimon222: ok, I'll leave it running; will you be here in an hour so we can try to figure out how to speed it up, if it finishes with no error?
[17:15:50] <dimon222> sure, but I can't really guarantee that I can help with that problem. It seems like simple resource usage growth
[17:19:42] <dimon222> anyway, you can try unordered bulk insert with some code for making 16MB chunks
[17:20:15] <dimon222> wait, misread, it's already in there
[17:21:03] <dimon222> multithreading then?
[17:33:21] <Forest> dimon222: it's still running... how can I use multithreading?
[17:34:47] <mango_> anyone here working on M202?
[18:05:23] <Forest> dimon222: hmm, strange: FATAL ERROR: JS Allocation failed - process out of memory, and the process manager showed it uses only 1.6 GB of memory
[18:05:52] <dimon222> :o
[18:07:07] <dimon222> https://www.npmjs.org/package/ofe
[18:07:09] <dimon222> try
[18:08:47] <Forest> dimon222: what should that package do?
[18:09:08] <dimon222> well, pretty much what it says now
[18:09:20] <dimon222> show full heap where allocation problem happened
[18:09:26] <dimon222> it should help you find memory leak
[18:12:25] <dimon222> there's a default memory limit for a node app, so obv you have a memory leak somewhere
[18:12:48] <dimon222> you can increase the memory limit, but obv it's not a solution, you need to fix the leak - http://stackoverflow.com/a/21936536/1667179
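The flag the linked answer refers to raises V8's old-space limit, e.g. (script name hypothetical):

```sh
node --max-old-space-size=4096 insert-osm.js   # roughly 4 GB heap instead of the default
```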
[18:13:07] <Forest> dimon222: so I just install that package and let it run again, right?
[18:13:37] <dimon222> readme on this page on the bottom
[18:13:43] <dimon222> https://www.npmjs.org/package/ofe
[18:14:16] <dimon222> install + overwrite call
[18:29:15] <Forest> dimon222: hmm, it doesn't want to compile: ..\ofe.cc(5): fatal error C1083: Cannot open include file: 'sys/time.h': No such file or directory [F:\Users\Doma\node_modules\ofe\build\ofe.vcxproj]
[18:41:40] <Forest> dimon222: I am on Windows and somehow it can't find that include file...
[18:50:51] <chanced> what would be the most effective way to compare two collections and get a list of records not found in the other? I have a unique id on each, but I'd rather not have to pull all of them and do it server-side
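One way, sketched in mongo shell syntax with hypothetical collection and field names: pull only the unique ids from one collection and ask the other for documents whose id is not in that list. This works as long as the id list fits comfortably in memory and in a single query document.

```javascript
// Unique ids present in collection "a"
var idsInA = db.a.distinct("uid");

// Documents in "b" whose uid does not appear in "a"
var onlyInB = db.b.find({ uid: { $nin: idsInA } }).toArray();
```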
[19:34:25] <mango_> I can't seem to fork a second mongo server
[19:34:26] <mango_> http://www.macrumors.com/2014/10/19/mac-mini-2014-benchmark/
[19:34:31] <mango_> wrong link
[19:34:49] <mango_> basically changed the port and log file
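A second mongod on the same host also needs its own data directory; a sketch with placeholder paths and port:

```sh
mongod --fork --port 27018 \
       --dbpath /data/db2 \
       --logpath /var/log/mongodb/mongod2.log
```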
[21:08:49] <syadnom> hi all. I can't find information on how to have mongodb accept writes to a secondary in a replica set. Maybe a redirect to the primary or something.
[21:10:05] <syadnom> I have an app (3rd party) that writes to mongodb and I need to create a distributed set of these with mongodb replicating. There are some obvious tutorials on this part, but now I need to have devices that connect to one of the secondaries be able to write, as if the secondary were writable. Is there a write proxy etc etc to allow this?
[21:34:16] <joannac> syadnom: no. connect to the primary if you need to write
[21:35:19] <syadnom> unfortunately I can't change the config of the software (the UniFi controller from Ubiquiti)
[21:38:00] <joannac> How are you connecting to mongodb then? driver?
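For reference, joannac's advice in driver terms: a replica-set connection string lets the driver discover the current primary and route writes to it automatically. A sketch with the 2014-era Node.js driver; hostnames, set name, database and collection are all hypothetical:

```javascript
var MongoClient = require('mongodb').MongoClient;
var uri = 'mongodb://host1:27017,host2:27017,host3:27017/mydb?replicaSet=rs0';

MongoClient.connect(uri, function (err, db) {
  if (err) throw err;
  // Writes go to whichever member is currently primary.
  db.collection('devices').insert({ name: 'ap-01' }, function (err) {
    if (err) throw err;
    db.close();
  });
});
```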
[23:16:48] <uehtesham90> hi, I am using MongoDB with Flask. I store records into my MongoDB database from an API which returns its data as JSON. I want to run my script weekly to check whether the API has new data, and if it does, insert only the new records. Does that mean I have to check each record I get and see if it's already in the database and,
[23:16:48] <uehtesham90> if not, then insert the new record? Or is there a better way to do that?
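One common pattern, assuming each API record carries a stable unique key (the field name below is hypothetical): create a unique index on that key and upsert every fetched record, so existing ones are updated in place and only new ones are inserted. Sketched in mongo shell syntax; PyMongo offers the same calls.

```javascript
// One-time: enforce uniqueness on the id coming from the API.
db.records.ensureIndex({ externalId: 1 }, { unique: true });

// Weekly sync: upsert every record fetched from the API
// (apiRecords is a placeholder for the parsed JSON response).
apiRecords.forEach(function (rec) {
  db.records.update(
    { externalId: rec.externalId },  // match on the stable key
    { $set: rec },                   // update an existing record ...
    { upsert: true }                 // ... or insert it if it's new
  );
});
```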