#mongodb logs for Thursday the 18th of August, 2016

[06:09:13] <therealssj> in pymongo how do we set a document-level TTL?
[06:09:37] <cheeser> TTLs live at the index level
[06:11:54] <therealssj> @cheeser i defined the index for the collection as self.db.create_index('expire', expireAfterSeconds=3600), self.db has the collection I am storing the documents in
[06:12:30] <cheeser> you need to define which field to index
[06:12:45] <therealssj> isn't that what ‘expire’ is?
[06:15:11] <cheeser> that's an option on the index.
[06:15:33] <cheeser> oh! maybe you're right actually. (i don't actually use pymongo)
[06:15:41] <cheeser> the java api is a bit more explicit
[06:16:01] <joannac> therealssj: did the command you pasted above work?
[06:16:09] <joannac> if not, what was the error?
[06:16:13] <therealssj> joannac: yes it created the index
[06:16:25] <therealssj> the documents dont expire
[06:18:08] <joannac> can you pastebin 1. list of indexes for that collection 2. a document you think should have expired, but didn't
[06:18:18] <therealssj> joannac: yes just a sec
[06:18:47] <joannac> preferably output from a mongo shell, but output from pymongo is fine too at a pinch
[06:19:19] <cheeser> is expire a date field?
[06:23:27] <therealssj> cheeser: yes, a UTC datetime stored as an ISODate
[06:23:52] <therealssj> joannac: http://pastebin.com/Ku9pecpV this is the list i got from pymongo
[06:23:58] <therealssj> not very readable I guess :/
[06:25:00] <cheeser> in the shell do a findOne() on a document that should be gone and pastebin that.
[06:32:58] <therealssj> joannac: http://pastebin.com/jtTJSBPq
[06:33:01] <therealssj> cheeser: ^^
[06:33:15] <therealssj> i had dropped the database so had to insert some new data
[06:34:00] <joannac> okay
[06:34:26] <joannac> well, now we need to wait an hour. I expect to be afk by then
[06:34:38] <joannac> so maybe insert some documents with dates in the past?
[06:35:19] <cheeser> yeah. then wait one minute.
[06:36:41] <therealssj> joannac: that is exactly my doubt: that the expire is for the whole collection
[06:37:28] <therealssj> joannac: how would I set an expire for a specific document for a specific time, let's say 5 minutes
[06:37:38] <therealssj> for another one, let's say 6 minutes
[06:37:42] <therealssj> or is that not possible
[06:37:44] <joannac> well, your expire is for 1 hr
[06:37:56] <joannac> so if you want it to expire in 5 minutes, set the date to be 55 mins in the past
[06:38:06] <joannac> so if you want it to expire in 6 minutes, set the date to be 54 mins in the past
[06:38:15] <joannac> or set your expireAfterSeconds to be 0
[06:38:24] <joannac> and then set the expire field to be the time you want it to expire
[06:38:40] <therealssj> joannac: ohh :o
[06:38:53] <therealssj> thanks let me try that :)
[06:39:14] <joannac> this is all covered in the docs, btw
[06:39:16] <joannac> https://docs.mongodb.com/manual/tutorial/expire-data/#expire-documents-at-a-specific-clock-time
[06:39:48] <therealssj> oh i missed that :O
[06:39:56] <therealssj> I got so confused between the docs
[06:41:10] <therealssj> the code there didn't work as-is for pymongo :/
[06:41:39] <therealssj> but there is a separate link for pymongo too
[06:41:53] <therealssj> and it got very confusing after one point :/
[06:48:13] <therealssj> joannac: thnx it worked :D
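A minimal pymongo sketch of the per-document expiry joannac describes (the database, collection, and field names here are illustrative, not therealssj's actual code):

    import datetime
    from pymongo import MongoClient

    coll = MongoClient().mydb.events  # hypothetical database/collection

    # With expireAfterSeconds=0, each document is deleted once its own
    # 'expire' date passes, so every document can carry its own lifetime.
    # (A nonzero value instead deletes documents that many seconds after
    # the indexed date; an index can only have one TTL setting.)
    coll.create_index('expire', expireAfterSeconds=0)

    coll.insert_one({'msg': 'gone in ~5 min',
                     'expire': datetime.datetime.utcnow() + datetime.timedelta(minutes=5)})
    coll.insert_one({'msg': 'gone in ~6 min',
                     'expire': datetime.datetime.utcnow() + datetime.timedelta(minutes=6)})

The TTL monitor only runs about once a minute, so removal is approximate, which is also why cheeser's "then wait one minute" applies.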
[08:12:11] <Kamihan> Hello. I am running npm parse server, nodejs, and mongodb on my local machine. I have an app that uses Parse to register uses to the Mongodb, but when I do "show dbs" in mongo shell, all of them are empty. Where is Parse saving users in mongodb?
[08:12:26] <Kamihan> register users*
[09:59:56] <wim_PA> Not sure if this is the right place to ask but I'm experiencing some weird behavior benchmarking the PHP mongo and mongodb extension. Running the following code with the mongo extension http://pastebin.com/rxZsRz9J gives me an instant duplicate key exception with the same ObjectId(). However, if I place the $result assignment inside the for loop it works fine. Is this intended behavior or a bug?
[10:03:16] <GothAlice> wim_PA: Not a bug, your example code is bad. When inserting a record, the driver assigns an ObjectId to the _id key of your assoc. array (if an _id key is not already present). Thus, the first iteration will run fine, add the _id, then each subsequent iteration will attempt to re-insert the record with the same _id, giving you the duplicate key error.
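The same pitfall is reproducible in Python, since pymongo likewise injects the generated _id into the dict it is handed; a sketch of the failure mode GothAlice describes (not wim_PA's actual benchmark):

    from pymongo import MongoClient
    from pymongo.errors import DuplicateKeyError

    coll = MongoClient().test.bench  # hypothetical collection

    doc = {'n': 1}
    for i in range(3):
        try:
            coll.insert_one(doc)  # first pass adds doc['_id'] in place
        except DuplicateKeyError:
            print('duplicate _id on iteration', i)  # fires on every later pass

    # Fix: build a fresh document inside the loop (or pop '_id' each time).
    for i in range(3):
        coll.insert_one({'n': i})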
[10:04:01] <Derick> wim_PA: that behaviour is no longer there in the new mongodb driver though - you should think about upgrading.
[10:04:11] <wim_PA> I noticed
[10:05:05] <wim_PA> Thanks
[10:06:38] <GothAlice> Always best to be current with driver updates; sometimes the communications protocol changes (for example, adding an updated authentication mechanism), so if your host updates mongod and you don't have an up-to-date driver, it can cause problems.
[10:07:08] <GothAlice> (Recently had a bit of a panic about pretty much just that at work while I was on vacation. ;)
[10:14:25] <wim_PA> Absolutely agree, the example code was just meant to benchmark the old and the new extension. Because if I want to use the new extension I have to do some refactoring of my codebase.
[10:17:21] <wim_PA> Preliminary results show that the new extension is actually a bit faster which I didn't anticipate considering a lot of the 'query code' has been moved to userland. I'm probably missing some 'first' call overhead or something. Not sure yet
[10:20:32] <GothAlice> wim_PA: Also notably, for many inserts a better pattern is to use bulk inserts. Tight loops of bare inserts aren't overly efficient. Ref: http://mongodb.github.io/mongo-php-library/api/class-MongoDB.Operation.BulkWrite.html
[10:21:29] <GothAlice> (Tight loops of bare inserts will introduce more variability from network effects, for example, as each insert is independently confirmed with a full network roundtrip.)
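Expressed in pymongo for consistency with the rest of the day's log (the PHP library's BulkWrite linked above is the equivalent), a sketch of the difference:

    from pymongo import InsertOne, MongoClient

    coll = MongoClient().test.bench  # hypothetical collection

    # One network round trip per document: each insert is individually
    # acknowledged, so latency dominates tight loops.
    for i in range(1000):
        coll.insert_one({'n': i})

    # Batched: far fewer round trips, one acknowledgement per batch.
    coll.insert_many([{'n': i} for i in range(1000)])

    # Mixed operation types can be batched with bulk_write.
    coll.bulk_write([InsertOne({'n': i}) for i in range(1000)])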
[10:24:27] <wim_PA> That's good to know, I have it on my todo :)
[10:24:40] <wim_PA> However the biggest wins were actually made on 'find()'
[10:25:56] <Derick> wim_PA: our benchmarks showed it too - are you running PHP with opcache enabled?
[10:28:08] <wim_PA> I tried with and without
[10:34:14] <wim_PA> The insert appears to be a lot slower with the new driver (insertOne() vs insert())
[10:35:58] <wim_PA> Derick: Are there any benchmarks online to verify my findings? Couldn't find them on Google, which is why I decided to run my own benchmarks and get a better understanding of the performance impact
[10:36:31] <Derick> wim_PA: sorry, we don't publish our benchmarks.
[10:36:41] <wim_PA> Ok
[13:06:17] <wim_PA> Derick: I expected the 'first-time' overhead from the new PHP MongoDB extension to come from the userland library, but that doesn't seem to be the case. The first time MongoDB\Driver\Server->executeQuery() is called, it takes 20-28ms for a simple findOne() query on a very small dataset. Any query after appears to be blazing fast. Where does the overhead come from, as the old extension has much less?
[13:08:20] <GothAlice> wim_PA: Likely lazy connection handling.
[13:08:35] <GothAlice> I.e. the old driver would eagerly connect its pool, the new driver likely waits for the first actual use.
[13:08:52] <GothAlice> Do you get the same delay on first insert if you query (even a non-existent collection) first?
[13:11:01] <GothAlice> Right; insert a long delay between setting up the connection and your first query; check netstat or similar to see if there is a real connection; then have a delay after the first query and check netstat again. That'll show explicitly whether lazy connecting is the cause of that delay.
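A rough timing version of that experiment, sketched in Python since pymongo also connects lazily (the numbers you see will vary):

    import time
    from pymongo import MongoClient

    t0 = time.time()
    client = MongoClient('localhost', 27017)  # returns at once: no socket yet
    print('construct:    %.1f ms' % ((time.time() - t0) * 1000))

    t0 = time.time()
    client.test.junk.find_one()  # first operation forces the connection
    print('first query:  %.1f ms' % ((time.time() - t0) * 1000))

    t0 = time.time()
    client.test.junk.find_one()  # connection is reused: much faster
    print('second query: %.1f ms' % ((time.time() - t0) * 1000))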
[13:11:22] <wim_PA> Ok, will check
[13:17:36] <wim_PA> Ah yes no connection is made indeed
[13:24:54] <GothAlice> :)
[13:28:16] <wim_PA> Better this way indeed. However I can't seem to understand how this can cause a 66% req/s drop... 450 -> 150. More testing required! Thanks :)
[13:33:54] <Derick> wim_PA: yeah, definitely the initial connection
[13:34:41] <wim_PA> But it seems that the old extension does not suffer from such a delay making the connection
[13:35:16] <Derick> wim_PA: it makes the connection at new MongoClient
[13:37:16] <wim_PA> Double checking something
[13:51:55] <wim_PA> Derick: Yeah, that's what I had to double check: whether I included the instantiation of MongoClient within the benchmark, and it seems I did. I made a smaller benchmark only for the connection part: http://pastebin.com/m5X02v9P Not sure if that's correct, but it shows that from connection to first query the old extension seems to be twice as fast. Hmm
[15:24:07] <Jimu> Any performance experts here? My database has 49 million documents, and simple increment ($inc) commands are taking 30-400 seconds
[17:14:11] <Jimu> Don't everyone type at once
[17:50:38] <GothAlice> Jimu: Busy workday. HTDSA request/response signing and SSO exchanges, oh my!
[17:51:11] <GothAlice> Jimu: Step one is usually: get an explain() of the query and make sure it's using an index. No index = crawls your records = you're gonna have a bad time.
[19:01:29] <deathanchor> don't forget to check it's using the right index, or else you're gonna have a bad time.
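A sketch of that first diagnostic step in pymongo (the collection, filter, and field names are hypothetical):

    from pymongo import MongoClient

    coll = MongoClient().mydb.counters  # hypothetical collection

    # Explain the query part of the update; without a usable index the
    # winning plan is a COLLSCAN, i.e. a crawl over all 49M documents.
    plan = coll.find({'key': 'page-42'}).explain()
    print(plan['queryPlanner']['winningPlan'])

    # An index on the filtered field turns that into an IXSCAN,
    # and the $inc itself becomes cheap.
    coll.create_index('key')
    coll.update_one({'key': 'page-42'}, {'$inc': {'count': 1}})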
[19:09:33] <idd2d_> ls
[19:09:49] <idd2d_> :x
[19:10:58] <n1colas> Hello
[19:11:48] <idd2d_> I have a collection with about 200 million records. Each doc has about 40 fields. I need to be able to query on any number & any combination of these fields. So far, this has been a performance issue because I can't index every single combination of fields. I'm not a mongo-pro, wondering if there's a good solution for this.
[19:12:16] <cheeser> if you're going to query against a field, it's going to need an index.
[19:12:27] <idd2d_> uh oh.
[19:12:37] <cheeser> look at your query patterns. you can use those to infer the proper indexing.
[19:12:42] <idd2d_> That's my take as well. I was hoping there was some feature I'm not familiar with.
[19:12:48] <idd2d_> a silver bullet
[19:12:54] <cheeser> now, you *might* be able to get away without indexing the less common ones.
[19:13:16] <idd2d_> right. That makes sense. My fear is that our indexes will grow so large that the app can't scale.
[19:13:20] <cheeser> if other fields are indexed, the query plan will use that index. but then to match the remaining fields, it'll have to scan the matched set.
[19:13:51] <cheeser> that might suggest a data model issue. hard to say for sure without seeing that model but that's something to consider.
[19:13:55] <idd2d_> Is it true that indexing each *individual* field is insufficient? I would need to index for combinations of fields corresponding to common queries, as well?
[19:14:00] <idd2d_> Sure, we're thinking about that as well.
[19:14:22] <cheeser> yes. you'd need compound indexes covering most/all the fields being matched.
[19:15:48] <idd2d_> That makes sense. This confirms my fears, but we'll figure it out. Thanks for the tips!
[19:15:56] <cheeser> yep
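A short illustration of the compound-index point in pymongo (field names are hypothetical): a compound index serves queries that match a prefix of its keys, so indexes get built around the common query patterns rather than every combination of 40 fields.

    from pymongo import ASCENDING, MongoClient

    coll = MongoClient().mydb.records  # hypothetical collection

    # One compound index covering a common query pattern.
    coll.create_index([('country', ASCENDING),
                       ('status', ASCENDING),
                       ('created', ASCENDING)])

    # Uses the index: the filtered fields form a prefix of the key list.
    coll.find({'country': 'NZ', 'status': 'active'})

    # No leading 'country' filter: this query can't use that index
    # efficiently and falls back to another index or a scan.
    coll.find({'status': 'active', 'created': {'$gt': 0}})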
[20:25:40] <idd2d_> Is there a standout search engine solution for Mongo? I've heard of people using Elasticsearch, Solr, etc. but I'm not sure which one might be best. Does it depend on the particulars of the data to be searched?
[20:35:42] <master_op> in java: I always get this exception
[20:35:43] <master_op> INFOS: Exception in monitor thread while connecting to server localhost:27017 com.mongodb.MongoSocketOpenException: Exception opening socket
[20:36:02] <master_op> but I can't catch it even with a catch (Exception e) block
[20:36:43] <cheeser> it's just an Exception so catch() will catch that.
[20:37:42] <master_op> cheeser: it occurs on connection; my program prints this in the console and after the timeout expires it handles the exception
[20:38:02] <master_op> why can't I catch it directly?
[20:38:08] <cheeser> connections are lazily init'd in the java driver.
[20:38:24] <cheeser> so nothing tries to connect until you try to do something on the server
[20:39:28] <master_op> yes, i know, i wrote this: mongoClient.getDatabase(mongoDBName); to force connection
[20:42:04] <master_op> sorry i mean this: mongoClient.getAddress();
[20:43:45] <cheeser> you should just pastebin your code somewhere
[20:44:01] <master_op> ok
[20:55:32] <master_op> cheeser: http://pastebin.com/DYFwRXbv
[21:04:55] <cheeser> and how does that code behave?
[21:09:43] <master_op> http://pastebin.com/XcN1bgEt
[21:10:19] <cheeser> oh, i see what's happening.
[21:10:33] <cheeser> the error you're seeing in the log is off in the monitoring thread.
[21:10:47] <cheeser> since it's off in a separate thread, your catch won't see it.
[21:11:23] <master_op> okay, what do you suggest to handle it ?
[21:12:42] <cheeser> probably there's no mongod running on that host/port
[21:14:09] <master_op> yes, I closed the mongo server on purpose just to handle this exception properly
[21:14:22] <master_op> i mean it's expected
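pymongo behaves analogously to the Java driver here: its background monitor logs its own failures, and application code only sees an error once an operation gives up waiting for server selection. A sketch of triggering and catching that deliberately (the short timeout just makes the test fail fast):

    from pymongo import MongoClient
    from pymongo.errors import ServerSelectionTimeoutError

    # Constructing the client never raises: connecting is lazy.
    client = MongoClient('localhost', 27017, serverSelectionTimeoutMS=2000)

    try:
        client.admin.command('ping')  # first real operation forces a connection
    except ServerSelectionTimeoutError as exc:
        print('mongod unreachable:', exc)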
[21:43:49] <angular_mike> are pymongo versions the same as their supported mongodb versions?
[21:44:10] <cheeser> i don't think so.
[21:45:58] <angular_mike> I'm working on a python project that uses pymongo and I want to somehow enforce that it only supports mongodb versions starting with 3.2
[21:46:21] <cheeser> well, use a 3.2 feature and you're pretty much done
[21:47:28] <angular_mike> cheeser: it would be better to catch this during the application installation
[21:47:46] <angular_mike> or at least ASAP when starting it
[21:48:28] <GothAlice> angular_mike: You can pin versions in your project or library's setup.py file's install_requires section, for example, like this: https://github.com/marrow/contentment/blob/develop/setup.py#L105
[21:48:48] <GothAlice> That'll ensure the dependency is resolved at install or "setup.py develop" time.
[21:50:38] <GothAlice> You can also do things like ranges. I.e. pymongo>=3.2,<3.4 (i.e. to avoid large breaking changes prior to getting a chance to vet them)
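A minimal setup.py sketch of that pinning (the project metadata is hypothetical):

    from setuptools import setup

    setup(
        name='myproject',
        version='1.0',
        install_requires=[
            'pymongo>=3.2,<3.4',  # resolved at install / "setup.py develop" time
        ],
    )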
[21:50:44] <angular_mike> GothAlice: oh, there's a separate pymongo dependency that's closer to the mongodb versioning scheme?
[21:51:29] <angular_mike> so, mongoengine matches 1:1 the mongodb version it supports
[21:51:32] <angular_mike> good
[21:51:41] <GothAlice> Do not assume that.
[21:52:09] <GothAlice> There is no mention of that being a goal or feature in the README.
[21:52:11] <angular_mike> GothAlice: but I want the mongodb version dependency, not the python module
[21:53:01] <GothAlice> Ah, somewhat different.
[21:54:36] <GothAlice> https://github.com/marrow/mongo/blob/develop/web/db/mongo/__init__.py?ts=4#L62-L67 < if you have connection code like this, add a call to client.server_info() to look up the version.
[21:54:50] <GothAlice> (Just after making the client object.)
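A sketch of that check (the exact cutoff and failure message are up to the application):

    from pymongo import MongoClient

    client = MongoClient()
    version = client.server_info()['version']  # e.g. '3.2.9'
    major_minor = tuple(int(p) for p in version.split('.')[:2])
    if major_minor < (3, 2):
        raise RuntimeError('MongoDB >= 3.2 required ($sample support); '
                           'server reports ' + version)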
[21:55:02] <GothAlice> That's a good idea.
[21:56:18] <angular_mike> GothAlice: that's not as good as failing during install, but I guess it's better than nothing
[21:56:46] <GothAlice> Difficult to have an assumption about target MongoDB server at install-time, when that's usually an application configuration option requiring the app to already be installed.
[21:59:07] <angular_mike> GothAlice: yeah, I was hoping I could extrapolate the available mongodb version from the pymongo version
[21:59:22] <GothAlice> They are installed separately, likely (and hopefully!) on separate machines.
[21:59:31] <GothAlice> So… not remotely close to being possible to associate the two.
[21:59:39] <angular_mike> like, it only starts supporting 3.2 starting from a specific version
[21:59:49] <GothAlice> Yeah, that'd make so many people upset it'd be silly.
[21:59:50] <angular_mike> $sample is the specific feature I want supported
[22:00:12] <GothAlice> https://github.com/marrow/mongo/commit/94395ba2c7df99377a62ed797640056e6f6a2440 < thanks for the idea to add that as a feature to my connection helper, BTW. :)
[22:01:25] <angular_mike> If there's any way I can help out, it's by being dissatisfied, I guess
[22:02:47] <GothAlice> https://twitter.com/taotetek/status/766288663504388096 directly relates to your wish.
[22:03:50] <angular_mike> no argument there, though that doesn't mean that as a developer you care about it
[22:04:24] <GothAlice> The needs of the many outweigh the needs of the one. ;P
[22:04:47] <angular_mike> my name is legion for we are many