[08:08:10] <girb> I have a MongoDB master and one replica server … at a particular time of day there is very high read I/O even though I'm not running any queries
[08:08:51] <CipherChen> Hi, does MongoDB have any driver support for Golang?
[08:09:04] <girb> this happens every day .. are there any index operations scheduled automatically?
[08:29:36] <GothAlice> girb: Do you have any TTL indexes?
[08:30:08] <GothAlice> Those run a scheduled deletion pass every minute, and if you accidentally cluster expiry times instead of spreading them out, you can experience unexpected load.
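A minimal pymongo sketch of the TTL mechanism GothAlice describes; the database, collection and field names are hypothetical. The point of the jitter is that each minutely deletion pass (which has to read the index and the documents before removing them) stays small instead of hitting one big cluster of expiries.

    from datetime import datetime, timedelta
    import random

    from pymongo import MongoClient, ASCENDING

    client = MongoClient()                      # assumes a local mongod
    coll = client.mydb.sessions                 # hypothetical database/collection

    # TTL index: the background TTL monitor wakes up roughly once a minute
    # and removes documents whose "expireAt" time has passed.
    coll.create_index([("expireAt", ASCENDING)], expireAfterSeconds=0)

    # Spread expiry times with jitter rather than clustering them on one moment.
    coll.insert_one({
        "payload": "…",
        "expireAt": datetime.utcnow() + timedelta(hours=24, seconds=random.randint(0, 3600)),
    })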
[08:34:17] <kramer65_> I'm using MongoDB from Python to store PDF files. I now want to do some text extraction using the Python pdfminer lib, but extracting text from a file taken from MongoDB (GridFS) takes about 500x more time than using the exact same file from the local file system.
[08:34:50] <kramer65_> Does anybody know WHAT the exact difference is between a file taken from GridFS as opposed to the local file system?
[08:35:08] <girb> GothAlice: even if I use a TTL index, that will be a delete operation, but in my case it's high read I/O daily for about 2 hours
[08:35:29] <GothAlice> Before it can delete, it needs to read…
[08:35:47] <GothAlice> However, otherwise no, MongoDB doesn't have multi-hour scheduled tasks. That's application-land.
[08:36:33] <GothAlice> kramer65_: BSON packing/unpacking overhead, network transfer, the fact that files larger than a certain size are chopped up into smaller bits that need to be recombined…
[08:38:31] <kramer65_> GothAlice: Thanks for that. Ehm.. but would you know how that could explain the 500x performance decrease *after* I loaded the file from MongoDB?
[08:40:15] <kramer65_> because I guess that once the file is loaded from MongoDB, it should be identical to the file from the file system, right?
[08:40:21] <GothAlice> kramer65_: Not really, no. Possibly the pipeline is optimized for small reads from file streams and not string slicing. In fact, that's likely it. Pull the file out from GridFS into a TemporaryFile, seek(0) on it and use the file handle if possible.
[08:40:27] <GothAlice> kramer65_: Not if it's a string in RAM.
[08:41:00] <GothAlice> Strings are immutable in Python, meaning any string manipulation creates new strings… which can get out of hand if you have multi-megabyte strings.
[08:41:31] <GothAlice> (I.e. repeatedly copying the "remainder" of a string to process into a new string would exhibit slowdown like that.)
[08:42:09] <kramer65_> okay, would you have any tip on how I could pull the file out from gridFS into a temp file and do that seek on it? (I'm kinda lost in this sentence..)
[08:42:12] <GothAlice> If you can't pass in the file handle itself (i.e. read from an open file) use a NamedTemporaryFile, write out to that, then pass the filename in before cleaning up the file.
[08:43:08] <kramer65_> you mean write to the file system and then read back out from that?
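In essence, yes. A rough sketch of the approach GothAlice suggests, assuming pymongo/gridfs and pdfminer.six; the database name and the GridFS filename are hypothetical. The idea is to hand pdfminer a seekable file handle (or a real path) instead of one huge in-memory string.

    from tempfile import NamedTemporaryFile

    import gridfs
    from pymongo import MongoClient
    from pdfminer.high_level import extract_text   # pdfminer.six

    client = MongoClient()
    fs = gridfs.GridFS(client.mydb)                 # hypothetical database

    grid_out = fs.find_one({"filename": "report.pdf"})   # hypothetical filename

    # Spool the GridFS contents into a temporary file so pdfminer works
    # against a seekable file instead of a multi-megabyte string.
    with NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(grid_out.read())
        tmp.flush()
        tmp.seek(0)
        text = extract_text(tmp.name)               # or pass the handle itself: extract_text(tmp)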
[10:42:32] <GothAlice> repxxl: MongoDB supports many rich datatypes. Using these is critical for querying said data. (I.e. when is 2 > 10? When "2" and "10" are compared.)
[10:42:52] <GothAlice> http://bsonspec.org/spec.html is the specification of what can be stored.
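A small pymongo illustration of the typing point, using a hypothetical collection: numbers stored as strings compare lexicographically, so "2" really does sort above "10", and numeric range queries stop matching them at all.

    from pymongo import MongoClient

    coll = MongoClient().test.prices            # hypothetical collection
    coll.delete_many({})
    coll.insert_many([{"amount": 2}, {"amount": 10}, {"amount": "2"}, {"amount": "10"}])

    # Numeric comparison: matches only the integer 10.
    print(list(coll.find({"amount": {"$gt": 5}})))

    # String comparison: matches "2", because "2" > "10" lexicographically.
    print(list(coll.find({"amount": {"$gt": "10"}})))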
[11:40:30] <amitprakash> Hi, I have a collection C whose documents hold a list of DBRefs to another collection D. I want to dump the documents of C and D given a query on C, s.t. each instance of D belongs to some instance of C which satisfies the query
[11:40:49] <amitprakash> s/belongs to/has a dbref in
[11:46:26] <repxxl> when I store password hashes like "$2y$10$xVAp1e1qjOpnka7/C5BkL.oCZKVaW6TLAmkPDLUQfcAaHesDMcPE." I have them as strings inside MongoDB
[12:03:02] <amitprakash> Or how do I dereference a list of DBRefs back to documents?
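A hedged pymongo sketch for amitprakash's question, with hypothetical collection and field names: query C, then either resolve each DBRef individually with Database.dereference (one round trip per reference) or collect the referenced _ids and fetch the D documents in a single $in query.

    from pymongo import MongoClient

    db = MongoClient().mydb                     # hypothetical database
    query = {"status": "active"}                # hypothetical query on C

    c_docs = list(db.C.find(query))

    # Option 1: follow each DBRef individually.
    for doc in c_docs:
        resolved = [db.dereference(ref) for ref in doc.get("d_refs", [])]   # "d_refs" is hypothetical

    # Option 2: one batch query against D using the referenced _ids.
    d_ids = {ref.id for doc in c_docs for ref in doc.get("d_refs", [])}
    d_docs = list(db.D.find({"_id": {"$in": list(d_ids)}}))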
[13:13:49] <repxxl> how should a hash be saved into the db? something like "$2y$10$xVAp1e1qjOpnka7/C5BkL.oCZKVaW6TLAmkPDLUQfcAaHesDMcPE."
[13:35:44] <Avihay_work> repxxl: that hash IS a string
[13:37:52] <Avihay_work> you could compress it into a shorter string or a binary format, but it's a matter of cheap space vs cheap computation
[13:38:45] <Avihay_work> so in this case you'd rather pick whichever is easier and faster to program and debug.
[13:39:58] <Avihay_work> "cheap space vs cheap computation" = you won't gain much space by compressing, and the compression itself is also easy
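In other words, storing the hash as the string it already is, is the straightforward option. A minimal sketch, assuming the bcrypt package and a hypothetical users collection:

    import bcrypt
    from pymongo import MongoClient

    users = MongoClient().mydb.users            # hypothetical collection

    # bcrypt output such as "$2b$12$…" is plain ASCII; store it as a string field.
    hashed = bcrypt.hashpw(b"s3cret", bcrypt.gensalt())
    users.insert_one({"email": "alice@example.com", "password_hash": hashed.decode("ascii")})

    # Later verification: compare the supplied password against the stored hash.
    doc = users.find_one({"email": "alice@example.com"})
    ok = bcrypt.checkpw(b"s3cret", doc["password_hash"].encode("ascii"))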
[13:41:03] <Nours312> Hi, I found a bug in the Mongo PHP driver: we can save $data = ['key.forbidden' => 'data'] … but we'll have problems using it afterwards, no? How and where can I report that?
[13:57:03] <Naeblis> mongoose provides a .findById() function as well
[14:13:35] <Constg> Hello, I have a question: I have an array of documents like this: a: [{type: 1, ord: 2}], and I'd like to update it like this: update({$addToSet: {a: {type: 1, ord: 3}}}). Of course $addToSet will add the doc to the array, but I'd like to tell it that if type = 1 it should update ord rather than create a new element. Is that possible?
[14:53:45] <Constg> Hi Avihay_work, this is what I'm using, but I perform some other operations in it, and my find query is an $or[] operator where I put the query I want to perform in priority. Basically, I look for _id: 111 $or a.type: 1, so I can't use it to find the index in the a array, because there is a chance that the first operand of the $or (_id = 111) matches, and then the other part (a.type = 1) will not be used.
[14:56:13] <Avihay_work> if you get a document with some query, you can use its _id in the update query
[14:56:33] <Avihay_work> I don't think I understand you
[14:57:15] <repxxl> I have a deeply structured document; how do I insert new fields into the already created document?
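A small sketch for repxxl's question about adding fields deep inside an existing document; the collection and field names are made up. $set with dot notation reaches any depth without rewriting the rest of the document.

    from pymongo import MongoClient

    coll = MongoClient().mydb.profiles          # hypothetical collection

    # An existing, already deeply nested document.
    coll.insert_one({"_id": 1, "settings": {"notifications": {"sms": False}}})

    # $set with dot notation adds (or overwrites) a nested field in place.
    coll.update_one({"_id": 1}, {"$set": {"settings.notifications.email": True}})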
[15:00:18] <chetandhembre> hi, I added server A to my cluster, and it is my primary in replica set "R1" .. what if I change my primary in "R1", will it still work?
[15:04:54] <Constg> Avihay_work, I need to limit the calls to the database (working with PHP), so yes, with two queries it's easy, but I'm trying to make it one query only ;)
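One common pattern for Constg's case, shown as a hedged pymongo sketch with made-up collection and values: match the array element by type directly in the update filter and use the positional $ operator to update just its ord, falling back to a $push only when nothing matched. The usual "element already exists" case stays at one round trip.

    from pymongo import MongoClient

    coll = MongoClient().mydb.items             # hypothetical collection

    # Update ord on the existing array element whose type is 1.
    result = coll.update_one(
        {"_id": 111, "a.type": 1},
        {"$set": {"a.$.ord": 3}},               # "$" refers to the matched array element
    )

    # Only if no element with type 1 exists yet, append a new one.
    if result.matched_count == 0:
        coll.update_one({"_id": 111}, {"$push": {"a": {"type": 1, "ord": 3}}})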
[15:21:11] <repxxl> when I have a document with arrays like 0 => array, 1 => array, 2 => array, how do I update with MongoDB so that it actually continues from the last number, i.e. 3 => array .. etc
[17:23:14] <Mkop> I'm confused by the mongo authentication model. I'm able to connect using `mongo dbserver/foo -u user -p` and then `use bar` but I can't `mongo dbserver/bar -u user -p`
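What Mkop is running into is that a user is defined in one database and must be authenticated against that database, even when working with another one. In recent pymongo that is the authSource option (the shell's equivalent is --authenticationDatabase); host and credentials below are placeholders.

    from pymongo import MongoClient

    # Authenticate against the database where the user was created ("foo"),
    # then read and write a different database ("bar").
    client = MongoClient(
        "dbserver",
        username="user",                        # placeholder credentials
        password="secret",
        authSource="foo",
    )
    bar = client.bar
    print(bar.list_collection_names())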
[18:38:32] <proteneer> if only Mongo had a way to do multi document transactions
[18:38:37] <proteneer> then it wouldn’t be a complete shit db
[19:04:16] <hicker> If I want to mutate the data before sending a response, can I do that? Or is that a big no no
[19:08:49] <nehaljwani> Hi! After mongodump on server1 and mongorestore on server2, I am not getting the same output for db.getUsers(): on server1 I see one user and on server2 I see none.
[19:08:52] <nehaljwani> Could someone please help?
[20:52:31] <repxxl> I have a key "data" that is holding data like 0: {}, 1: {}, 2: {}, 3: {}; I want to add data with MongoDB so it continues like 4: {} ....
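If "data" is stored as a real BSON array, $push simply appends to the end and the numeric positions (0, 1, 2, 3, then 4) follow automatically; a sketch with a hypothetical collection:

    from pymongo import MongoClient

    coll = MongoClient().mydb.things            # hypothetical collection

    # A document whose "data" field is an array of sub-documents.
    coll.insert_one({"_id": 1, "data": [{"n": 0}, {"n": 1}, {"n": 2}, {"n": 3}]})

    # $push appends to the end of the array; the new element becomes index 4.
    coll.update_one({"_id": 1}, {"$push": {"data": {"n": 4}}})

    print(coll.find_one({"_id": 1})["data"])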
[21:05:07] <repxxl> joannac, I was thinking arrays don't exist in MongoDB; I thought it works like arrays but is called "multikeys" instead
[21:05:44] <repxxl> joannac, so when I hear arrays here in this channel I always think of a programming language like PHP, etc.
[21:06:41] <joannac> no, arrays definitely exist in mongodb
[21:27:15] <repxxl> how do I findOne when I have arrays like 0 =>, 1 =>, 2 => and I need to find something that is inside these arrays?
[21:47:48] <windsurf_> I need to determine the distance between two points (point == lat/lng) and be able to search for points that are x distance from origin point. I’m trying to understand some of the terminology...
[21:48:05] <windsurf_> I have documents currently with lat and lng fields
[21:48:28] <windsurf_> it looks like I need to change these into a nested field, we’ll call it ‘loc’
[21:48:59] <windsurf_> and I think the structure needs to be this { type: "Point", coordinates: [ -73.97, 40.77 ] }, where Point and ‘coordinates’ are reserved names with meaning to mongodb right?
[21:51:21] <windsurf_> and when and why do I need to do db.collection.ensureIndex( ?
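A hedged pymongo sketch of what the GeoJSON restructuring plus the index buys windsurf_: "type" and "coordinates" are indeed the reserved GeoJSON field names, the 2dsphere index (ensureIndex in the shell, create_index in pymongo) is what makes geo operators like $near work, and $maxDistance then limits results in metres. Collection name and coordinates are placeholders.

    from pymongo import MongoClient, GEOSPHERE

    places = MongoClient().mydb.places          # hypothetical collection

    # GeoJSON points live under "loc", with coordinates in [lng, lat] order.
    places.insert_one({"name": "somewhere", "loc": {"type": "Point", "coordinates": [-73.97, 40.77]}})

    # The 2dsphere index is required for $near / $geoNear style queries.
    places.create_index([("loc", GEOSPHERE)])

    # Points within 5 km of an origin point, nearest first.
    nearby = places.find({
        "loc": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [-73.98, 40.76]},
                "$maxDistance": 5000,           # metres
            }
        }
    })
    print(list(nearby))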
[22:27:40] <repxxl> how do I find something in multiple arrays where the keys are named by numbers? I'm using $push to create the arrays ..
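For repxxl's repeated question about searching inside $push-built arrays (the numeric keys are just array positions): dot notation or $elemMatch checks every element, so no index number is needed. A sketch with hypothetical fields:

    from pymongo import MongoClient

    coll = MongoClient().mydb.things            # hypothetical collection
    coll.insert_one({"data": [{"tag": "a", "score": 1}, {"tag": "b", "score": 7}]})

    # Dot notation matches if ANY array element has tag == "b".
    doc = coll.find_one({"data.tag": "b"})

    # $elemMatch requires a single element to satisfy all conditions at once.
    doc = coll.find_one({"data": {"$elemMatch": {"tag": "b", "score": {"$gt": 5}}}})
    print(doc)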