[03:00:02] <bmillham> I'm looking for help with a query that orders on an embedded documents field.
[03:01:12] <bmillham> It mostly works, but if the embedded document (a list of embedded documents) has more than one element, I'm not getting the expected results.
[03:28:00] <s2013> anyone know the best way to get random documents from the last 5 days
[03:28:22] <s2013> i can do a count for all the docs in the last 5 days
[03:28:29] <s2013> and then do a random skip or something. but is there something more efficient
[03:40:53] <Boomtime> s2013: there is no built-in mechanism to select something at random, your idea to skip() will work well enough for many cases - though ensure it is indexed and sorted at least
[03:41:15] <s2013> hmm alright. but if i need to get multiple documents do i need to make multiple queries
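A minimal sketch of the skip() approach for a single random pick, assuming a hypothetical "events" collection with an indexed "created" date field; as s2013 suspects, each additional random document is another query:

    var cutoff = new Date(Date.now() - 5 * 24 * 60 * 60 * 1000); // 5 days ago
    var match = { created: { $gte: cutoff } };
    var n = db.events.count(match);
    var doc = db.events.find(match)
                       .sort({ created: 1 })   // indexed + sorted, as Boomtime advises
                       .skip(Math.floor(Math.random() * n))
                       .limit(1)
                       .next();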
[03:42:39] <s2013> also is there any way to backup and restore a db remotely, from one remote db to another?
[03:46:01] <s2013> Boomtime, i see. what do you mean by more involved solution
[03:46:09] <s2013> you mean like change the schema a bit or something
[03:46:42] <Boomtime> s2013: by "backup" you describe a copy-database scenario like this I think: http://docs.mongodb.org/manual/reference/method/db.copyDatabase/
[03:47:28] <s2013> Boomtime, yeah let me check that out. thanks
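A minimal sketch of that call, run from the shell on the destination server (database name and host are hypothetical):

    // copies "mydb" from the remote host into a local "mydb";
    // add username/password arguments if the source requires auth
    db.copyDatabase("mydb", "mydb", "source.example.com:27017")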
[03:48:04] <aniasis> mongodb is basically a flat table?
[07:39:24] <Soothsayer_> Is it good practice to perform delete queries WITHOUT an index on a collection of > 100,000 documents? (It's a non-user-facing query and runs in the background at various events)
[07:43:50] <joannac> Soothsayer_: deleting how? by _id?
[07:44:22] <Soothsayer_> joannac: no, by passing a query on multiple fields (3 to be precise)
[07:46:13] <joannac> how long does the query take to match?
[07:47:55] <Soothsayer_> in my development environment.. 35 ms +
[07:48:42] <Soothsayer_> joannac: but this is at a data set of 25,000 documents
[07:49:01] <joannac> if you can afford the hit, sure
[07:49:16] <joannac> is the index just not useful enough?
[07:49:43] <joannac> why don't you have an index if you're doing this semi-regularly?
[07:49:53] <Soothsayer_> joannac: sorry, takes around 15 ms on remove for a 25,000 record set. (I was doing a sort earlier too).
[07:50:24] <Soothsayer_> joannac: yes, the index is not useful anywhere else.. just for this particular query.
[07:50:55] <Soothsayer_> joannac: hmm true, it's more than semi-regular.. it's pretty regular (whenever a Product gets deleted / disabled, this document has to be removed).
[07:51:04] <Soothsayer_> So it can be at any point, anything.
[12:43:10] <sabrehagen> hi guys, i'm getting some really weird results from mongodb. i have an object in my database, and inspecting it via robomongo, it shows one set of data, but querying the document via mongoose, some fields are empty. when i update the fields of the document that do show, they change, so i know i'm looking at the right document. what might be happening? (see "team":[]) http://i.imgur.com/Z9iEL9J.png
[12:51:48] <sabrehagen> mongoose model configured incorrectly
[13:18:00] <roadrunneratwast> I am starting with Mongoose. Is it possible to define a subtype without making a document? I am developing a taxonomy, each taxon of which has an image, a name, and a comment. I don't think I want to store the taxons in a separate document (do I?) but I would like each node of the taxonomy to have one of these
[14:06:06] <kephu> I was wondering, are there any tools to reduce the ridiculous verbosity of aggregate pipeline's statements?
[14:20:18] <cihangir> hi all, i have an int property and i want to update that field, i can also use $inc. which one is faster, $set or $inc?
[14:20:54] <kephu> my money would be on $inc, but I didn't run any benchmarks on it or anything
[14:21:18] <Derick> cihangir: how would you be using the $set ?
[14:21:32] <Derick> don't do a "find document, calculate new value, set with $set"...
[14:21:59] <cihangir> i know the old value before hand
[14:22:54] <Derick> I'd go with $inc, but I bet the speed difference is negligible
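The two forms under discussion, sketched with a hypothetical counter document; the real difference is that $inc applies the delta server-side and so stays correct under concurrent writers, while $set of a precomputed value can race:

    // $inc: atomic server-side increment
    db.counters.update({ _id: "hits" }, { $inc: { value: 1 } });
    // $set: only safe if you truly know the old value and nothing else writes
    db.counters.update({ _id: "hits" }, { $set: { value: oldValue + 1 } });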
[15:01:36] <Constg> Hello, I need your help, as I'm going mad.
[15:02:03] <Constg> I do an update: var test = db.users_done.update(...)
[15:02:45] <Constg> all is ok in the database, and if I do print(test), then I see nUpserted:1, _id:ObjectId("...")
[15:02:59] <Constg> And...impossible to get this _id except with print
[15:03:20] <Constg> test.nUpserted displays 1, but test._id displays undefined.
[15:03:38] <Constg> Do you have any idea why, and then how to get the last inserted id?
[15:03:53] <Constg> I use no client from language, I'm in Mongo shell
[15:04:09] <blaubarschbube> hi. we are facing a problem with replication. it hangs on moving a chunk. the error message is "InsertDocument :: caused by :: 17280 Btree::insert: key too large to index". how can we get rid of that? is it possible to just skip the error-causing chunks?
[15:04:39] <Constg> blaubarschbube, do you use GridFS?
[15:11:40] <blaubarschbube> so theres no possibility to skip or ignore that chunk?
[15:27:23] <elcha> how can I check that mongo is running with the recommended ulimit values? I noticed that there is an upstart script mongod.conf in my /etc/init with the 64000 open files stanza etc. However, if I cat the limits under /proc for the mongod pid, I see the default 1024 open files setting
[15:33:00] <Constg> Ok, I ask again, from the Mongo shell, in javascript, how to get the last inserted _id please? If I print() my insert, I can see a WriteResult object including the _id, but I can't access it.
[15:33:47] <kali> Constg: you can't. but what you can do is set the _id yourself (the driver just does new ObjectId()) just before inserting
[15:35:58] <Constg> kali, is it as fast as not specifying one? If yes, how could I manage to create the id with update and upsert = true? With a new _id, the object will never be the same and it will result in inserting a new object every time. Or could I specify the _id in $setOnInsert only?
[15:37:22] <kali> Constg: for the performance, it should be about the same, for the upsert... $setOnInsert may work, you need to try it
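A sketch of kali's suggestion, with a hypothetical match field: generate the ObjectId client-side, and $setOnInsert applies it only when the upsert actually inserts:

    var id = new ObjectId();
    db.users_done.update(
        { email: "user@example.com" },      // hypothetical unique match
        { $set: { last_seen: new Date() },
          $setOnInsert: { _id: id } },
        { upsert: true }
    );
    // if the update inserted, the new document's _id is `id`;
    // if it matched an existing document, `id` tells you nothing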
[15:44:08] <Constg> Argh, in fact, no, it doesn't help... what I want is to get the _id even when it was an update, not just an upsert... :-|
[15:44:54] <Constg> I'm afraid I'll have to query the collection for the _id
[15:45:09] <kali> Constg: would it not make sense to use "your" unique key as the _id ?
[15:46:57] <Constg> no, in fact here is what I do: I create an entry in collection1, and update another collection, collection2. If collection2 doesn't know the user from collection1, then it will create it and return the _id, to insert into collection1 to make the link between the two. In this case, creating the ObjectId beforehand works.
[15:47:26] <Constg> but if the user already exists in collection2, then how do I link my new entry in collection1 to the same user
[15:49:41] <Constg> kali, I'm not sure I'm clear...do you understand? ^^
[15:57:05] <Constg> but I'll do it another way: find(old_id); if it doesn't exist, then I create a new _id
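One route the channel didn't reach: findAndModify with upsert and new:true returns the resulting document either way, update or insert, _id included (field names hypothetical):

    var doc = db.collection2.findAndModify({
        query:  { user: "someuser" },
        update: { $set: { last_seen: new Date() } },
        upsert: true,
        new:    true                 // return the post-update document
    });
    print(doc._id);                  // valid whether it matched or inserted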
[15:57:08] <appledash> I'm running mongodb on a server with very low storage space... I found out about the storage.smallFiles option to avoid creating massive journal files
[15:57:12] <appledash> But how do I SET that option
[15:57:23] <appledash> adding storage.smallFiles=true in mongodb.conf says unknown option
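The dotted storage.smallFiles spelling only works in the YAML config format introduced with mongod 2.6; older ini-style config files spell it differently. Both variants, sketched:

    # YAML config file (mongod 2.6+)
    storage:
       smallFiles: true

    # legacy ini-style config file
    smallfiles = true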
[15:57:53] <Constg> You helped me realize I was going about it the wrong way ;)
[16:40:43] <Sengoku> Hey "org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'solrCrudRepository': Invocation of init method failed; nested exception is org.springframework.data.mapping.model.MappingException: Could not lookup mapping metadata for domain class java.lang.Object!
[17:00:32] <aliasc> So it's me again with my game, trying to decide whether mongodb is the right choice for handling my game's load
[17:00:38] <GothAlice> Another ticket closed, 'tis a good day.
[17:01:38] <GothAlice> aliasc: All I can say is that yes, MongoDB is capable of supporting Facebook-game style loads (via a standard webapp interface) with up to a million simultaneous users in our own testing. (We didn't bother testing higher… we were going to cross that bridge when we got to it.)
[17:02:31] <GothAlice> http://www.slideshare.net/montrealpython/mp24-the-bachelor-a-facebook-game was our presentation on our architecture.
[17:04:34] <aliasc> Well. The game will store object data and this can get CPU-intensive.
[17:05:07] <aliasc> It's a co-op-like game where two or more players advance the game, and the progress, including game object positions etc, is stored as coordinates
[17:06:40] <GothAlice> You said you implemented client/server. What're the format of the "packets" being sent between them? JSON strings, Bencode binary coding, Google Protobuf, packed C structs, …
[17:07:41] <GothAlice> That sounds badly engineered. Internally it'll still have to explode out the packed values…
[17:08:06] <aliasc> True. The data are further processed by the client.
[17:08:28] <GothAlice> And you can entirely avoid serialization/deserialization if you use BSON as your wire protocol. You'd be able to retrieve from MongoDB, then need to do *nothing* to the data before passing it off to the client. :)
[17:10:24] <GothAlice> http://bsonspec.org/spec.html — it's a very simple and clean specification, and there are library implementations for almost everything: http://bsonspec.org/implementations.html
[17:10:52] <GothAlice> (With an appropriate offset into the data, you could use the floats straight out of the packed data!)
[17:11:15] <aliasc> Well currently the client gets a json string, explodes the coordinates, does the conversion, then the rendering
[17:11:58] <GothAlice> In the way suggested above your client would get the BSON blob, find the indexes to the coordinates, and pass those pointers off to the rendering.
[17:12:07] <aliasc> It seems like i need to stop thinking of MongoDB as a web technology only.
[17:12:16] <aliasc> I'm used to PHP, NodeJS and web interfaces
[17:12:26] <GothAlice> aliasc: I use it as a massive-scale FUSE filesystem. (25 TiB of data in GridFS and counting…)
[17:13:33] <aliasc> the problem is, i was working on web technologies and RDBMS for the most part. I can't avoid thinking of it as just a web interface.
[17:13:56] <aliasc> This is why i engineered the client as a browser asking for a specific string.
[17:14:04] <GothAlice> Except MongoDB is client/server, speaking BSON, just like your own client/server isn't "web".
[17:14:33] <GothAlice> (However all of these are basically identical: they're stateful transactional request/response cycles.)
[17:14:50] <aliasc> Yea true. And i love MongoDB. I don't want to be disappointed. It's scalable, easy to implement and fast.
[17:20:36] <GothAlice> aliasc: My mantra is usually "help MongoDB help you make cleaner apps" — storing "x:1,y:2" as a string instead of a native structure is indicative of picking the wrong fight.
[17:20:38] <GothAlice> With the result of: client packs string, client bundles in JSON, client fires over the wire, server receives, unpacks JSON, packs BSON, sends the wrapped string to MongoDB. Subsequently MongoDB sends the BSON to the server, the server unpacks, re-packs into JSON, sends to a client, the client receives, unpacks the JSON, unpacks the string, then does something with the numbers.
[17:21:08] <GothAlice> And should make anyone go 'wat'.
[17:23:34] <cheeser> i worked at a company that decided early on to store a certain json document as a string and regretted it ever since.
[17:24:51] <omid8bimo> hello, i need help, i keep getting these errors on my 2 new secondary servers and they are falling further behind the primary every minute
[17:26:22] <GothAlice> omid8bimo: Because I can't be sure that I received all of that (and I likely didn't; I've got paranoid flood protection enabled), could you try that again gist'ed?
[17:34:57] <GothAlice> The significance of using a gem?
[17:36:17] <GothAlice> I.e. regardless of language or driver, querying "fs.files"—the metadata collection—will let you "find a file". http://docs.mongodb.org/manual/reference/gridfs/#the-files-collection < It has a pretty simple structure.
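For example, straight from the shell (the filename is hypothetical, and metadata fields are whatever your app stored):

    db.fs.files.findOne({ filename: "avatar.png" })
    db.fs.files.find({ "metadata.owner": "alice" })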
[17:37:57] <GothAlice> omid8bimo: http://docs.mongodb.org/manual/faq/diagnostics/#does-tcp-keepalive-time-affect-sharded-clusters-and-replica-sets < This may help.
[17:39:43] <aliasc> can you explain whats the best approach i should take
[17:40:03] <GothAlice> aliasc: Is your client written in C or C++? (I.e. do you have easy access to pointers?)
[17:41:12] <GothAlice> Well, even if you don't… using BSON as your own application's wire format will allow you to easily pass along data from MongoDB without requiring any additional work on the part of your server. This is a huge plus. The format is relatively compact (not as compact as packed C structs, but it's still pretty good) and explicitly has lengths on everything, so it's easy to stream.
[17:43:02] <GothAlice> And certainly, avoiding extra conversions (floats->string->json->string->bson->string->json->string->floats) is a must. This would also allow you to perform statistical analysis using MongoDB, as it'd understand that those are actually numbers. It can work with numbers. :)
[17:44:01] <GothAlice> And it's way more packed than JSON for most data. (Numbers esp., since in JSON they're variable length strings.)
[17:59:29] <GothAlice> aliasc: An example packed record containing a player ID and coords: 4_byte_length + "\x10p\x00" + 4_byte_player_id + "\x01x\x00" + 8_byte_x_coord_float + "\x01y\x00" + 8_byte_y_coord_float + "\x00" = 34 bytes. 38 if you need 64-bit player IDs.
[18:00:16] <GothAlice> In hex: 22000000107000abe72900017800000000000000f03f017900000000000000004000 — player 2746283, x 1.0, y 2.0
[18:02:26] <drags> Hello friends, if I'm writing a mongo shell script (as in, to be executed as `mongo myScript.js`), is there something akin to runCommand that accepts queries written as strings?
[18:02:42] <drags> I have an array of queries stored as strings which I want to run, not sure how to execute them
[18:04:04] <GothAlice> drags: I'm not sure how you're storing them as strings, considering queries are formed from rich, potentially deep structures.
[18:04:20] <GothAlice> drags: If you have them JSON encoded, drop the surrounding quotes and bam, that's not a string that's real data now.
[18:07:08] <GothAlice> drags: If they're JSON and you simply want to evaluate them as-needed, run the string through JSON.parse() before passing as the query.
[18:09:19] <drags> GothAlice: ah, so I should say, not just queries, but the whole query line including the "db.<collection>" component, which changes from query to query
[18:09:41] <GothAlice> That… is terrifyingly gross. Eval would be your only "real" option, there.
[18:09:45] <drags> is that something I can store and then re-call&execute?
[18:10:10] <GothAlice> db[collectionvar] works just as well as db.foo (where collectionvar = "foo")
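Putting those two hints together, a sketch of what myScript.js could look like (collection names and query strings are hypothetical):

    // each entry: which collection to hit, and its query as a JSON string
    var jobs = [
        { coll: "users",  query: '{"active": true}' },
        { coll: "orders", query: '{"total": {"$gt": 100}}' }
    ];
    jobs.forEach(function (job) {
        var cursor = db[job.coll].find(JSON.parse(job.query));
        print(job.coll + ": " + cursor.count());
    });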
[18:10:13] <drags> GothAlice: yeah it's intended for a quick internal reporting thing, but I'm thinking it might not be appropriate to do in this manner
[18:10:34] <drags> excellent, that gets me a little closer
[18:10:49] <drags> I'll play around with this and see what I can do. Thanks GothAlice :)
[18:11:00] <GothAlice> One of the few features I actually like in JS: Objects can be accessed via array notation, not just attribute access.
[18:31:07] <wroathe> Question: while reading through the mongo documentation I noticed the following in the FAQs: "MongoDB automatically uses all free memory on the machine as its cache." Do you guys think an argument could be made that using memcached alongside mongodb is overkill, even for large, distributed, applications?
[18:31:37] <wroathe> That is, if your data has been properly normalized (when appropriate) or embedded (when appropriate)
[18:32:27] <GothAlice> wroathe: We may be a little unusual at work, but we use MongoDB as a cache (using TTL indexes), session storage, near-zero-latency queue (capped collections and tailing cursors), as well as general data store.
[18:33:38] <wroathe> That's what I was thinking. It seems like MongoDB would work very effectively as an alternative to memcached.
[18:34:02] <GothAlice> One worthy note: TTL indexes are minute-accurate.
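A sketch of that cache-with-TTL pattern, with hypothetical names; the TTL monitor only sweeps roughly once a minute, hence "minute-accurate":

    db.cache.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 });
    db.cache.insert({ _id: "some-key", value: "...", createdAt: new Date() });
    // the document is removed roughly an hour after createdAt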
[19:00:55] <jonasliljestrand> Hey guys, are there any limits to db.collection.insert? Like how many documents
[19:43:52] <cheeser> for checking things like documents update/written.
[21:03:11] <spenguin> should "mongodb://localhost:27017/test" be sufficient to specify a mongodb connection at localhost on port 27017 using the "test" database?
[21:03:58] <cek> Do I understand correctly that mongodb can hold big tables, and that the bson doc restriction of 16MB is only for one "record", not for the whole "database"?
[21:06:41] <spenguin> mongodb has its own query syntax
[21:07:09] <spenguin> (and I'm not too familiar with SQL queries)
[21:07:45] <spenguin> cek: are you trying to do fuzzy matching?
[21:14:46] <BigOrangeSU> Hi all, does anyone know if it's possible to generate a mongo ObjectId for a given timestamp? I am trying to backpopulate some records.
[21:18:07] <cheeser> depends on the driver, i'd expect. you can pass in a Date object using java
[21:20:42] <BigOrangeSU> do you know if its possible in pymongo? I don't think there is anything in the doc, http://api.mongodb.org/python/current/api/bson/objectid.html
[21:22:53] <joannac> BigOrangeSU: what's the use case?
[21:23:34] <joannac> there is a very clear example of how to do it on that page, as well as a big warning that doing so is not safe for insertion since you lose the uniqueness guarantee
[21:24:09] <BigOrangeSU> joannac: Long story, but I ETL the mongo db to SQL, then I use aggregation and select on min(_id). These are records tracking a user on our site, so I am getting the first record for a given session. However I have to backpopulate some data
[21:24:43] <BigOrangeSU> joannac: thanks yea i see it (though I wont use it)
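For reference, the shell-side version of the trick on that docs page; it is query-only, since zeroing the random bytes forfeits the uniqueness guarantee joannac warns about:

    function objectIdFromDate(d) {
        // seconds since the epoch in hex, padded out to 24 hex chars
        var hexSeconds = Math.floor(d.getTime() / 1000).toString(16);
        return ObjectId(hexSeconds + "0000000000000000");
    }
    // e.g. everything created on or after Jan 1, 2015 (collection hypothetical):
    db.events.find({ _id: { $gte: objectIdFromDate(new Date("2015-01-01")) } });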
[21:28:50] <joannac> BigOrangeSU: how many documents do you need to populate?
[21:29:31] <BigOrangeSU> i think im going to just find another way to do it
[21:30:46] <joannac> yeah, i don't have a good solution for you
[21:31:27] <joannac> BigOrangeSU: you have these original times elsewhere in your document, right? why not sort on those?
[21:38:37] <BigOrangeSU> joannac: i do, it's just that I have so many queries I would have to update
[21:39:15] <BigOrangeSU> joannac: looking back, I was trying to save myself an extra join and simplify the queries by doing min(_id),
[21:39:44] <BigOrangeSU> rather than doing a join back to get columns outside of an aggregate, if that makes any sense
[21:59:21] <spenguin> joannac: should "mongodb://localhost/test" be sufficient to specify a mongodb connection at localhost on port 27017 using the "test" database, or do you have to supply credentials before specifying a different database?
[23:01:24] <blizzow> How can I find the most recently updated/inserted document in a collection?
[23:02:15] <Boomtime> you have a field in your schema called "last_modified" or some such
[23:03:54] <Boomtime> for example, when you do an update or insert op, add { $currentDate: { last_modified: true } }
[23:04:14] <Boomtime> (check my syntax i didn't test it, but it should be something like that)
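Boomtime's syntax checks out; a minimal sketch with hypothetical names, plus the query that then answers blizzow's question:

    db.things.update(
        { _id: someId },
        { $set: { status: "done" },
          $currentDate: { last_modified: true } }
    );
    // most recently updated/inserted document:
    db.things.find().sort({ last_modified: -1 }).limit(1);
    // an index on { last_modified: -1 } keeps that sort cheap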