PMXBOT Log file Viewer

#mongodb logs for Wednesday the 24th of December, 2014

[00:38:35] <ramfjord> Sup guys? Is it possible to write to a secondary node in a replica set?
[00:39:15] <ramfjord> our app (with write operations) was configured to use a node which stopped responding for a moment and became the secondary
[00:39:31] <ramfjord> so it seems that it has been doing so, despite the documentation saying you can't write to secondaries
[00:42:04] <cheeser> no. all writes go to a primary
[00:42:27] <ramfjord> so if it's writing, the driver is somehow figuring out what the primary is
[00:43:07] <cheeser> the driver always knows the primary
[00:43:17] <cheeser> except during an election/split of course
[00:43:45] <ramfjord> excellent, thanks for the info
[00:45:07] <cheeser> np
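(ed.: a minimal PyMongo sketch of what cheeser describes; the seed hosts and replica-set name are illustrative, and client.primary is the modern PyMongo spelling:)

    from pymongo import MongoClient

    # Connect with a seed list; the driver discovers the topology and
    # keeps track of which member is currently primary.
    client = MongoClient("mongodb://host1:27017,host2:27017/?replicaSet=rs0")

    # Writes are always routed to the primary, regardless of which
    # seed host the application happened to be pointed at.
    client.mydb.mycoll.insert_one({"hello": "world"})

    print(client.primary)  # (host, port) of the current primary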
[06:00:35] <Neo9> Hi folks
[06:00:46] <Neo9> Good morning
[08:06:32] <girb> new to mongoDB .. help me
[08:08:10] <girb> I have a mongo master and 1 replica server … at a particular time each day there is very high read I/O even though I'm not running any queries
[08:08:51] <CipherChen> Hi, does the mongodb driver have any support for Golang?
[08:09:04] <girb> this happens every day .. are there any indexes scheduled automatically
[08:29:36] <GothAlice> girb: Do you have any TTL indexes?
[08:30:08] <GothAlice> Those perform scheduled deletion passes every minute, and if you accidentally cluster expiry times instead of spreading them out, you can experience unexpected load.
[08:34:17] <kramer65_> I'm using MongoDB from Python to store PDF files. I now want to do some text extraction using the python pdfminer lib, but extracting text from a file taken from mongodb (gridfs) takes about 500x more time than taking the exact same file from the local file system.
[08:34:50] <kramer65_> Does anybody know WHAT the exact difference is between a file taken from GridFS as opposed to the local file system?
[08:35:08] <girb> GothAlice: if I use a TTL index also .. that will be a delete operation. but in my case it's high read I/O daily for about 2 hours
[08:35:27] <girb> and also at same time
[08:35:29] <GothAlice> Before it can delete, it needs to read…
[08:35:47] <GothAlice> However, otherwise no, MongoDB doesn't have multi-hour scheduled tasks. That's application-land.
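(ed.: a sketch of the TTL mechanism GothAlice describes, assuming PyMongo; the collection, field name, and jitter window are illustrative. The TTL monitor wakes roughly every 60 seconds and deletes whatever has expired, so clustered expiry times mean clustered I/O:)

    import datetime
    import random
    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative database name

    # Documents expire once their "expireAt" date has passed;
    # the server checks roughly once per minute.
    db.sessions.create_index("expireAt", expireAfterSeconds=0)

    # Spread expiry with a little jitter instead of one big cliff.
    db.sessions.insert_one({
        "expireAt": datetime.datetime.utcnow()
                    + datetime.timedelta(hours=2,
                                         seconds=random.randint(0, 600))
    })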
[08:36:33] <GothAlice> kramer65_: BSON packing/unpacking overhead, network transfer, the fact that files larger than a certain size are chopped up into smaller bits that need to be recombined…
[08:36:46] <girb> GothAlice: ok
[08:38:31] <kramer65_> GothAlice: Thanks for that. Ehm.. but would you know how that could explain the 500x performance decrease *after* I loaded the file from mongodb?
[08:38:56] <kramer65_> Stackoverflow question here: http://stackoverflow.com/questions/27616827/difference-between-reading-from-file-and-mongodb-gridfs
[08:40:15] <kramer65_> because I guess that once the file is loaded from mongodb, it should be identical to the file from the file system right?
[08:40:21] <GothAlice> kramer65_: Not really, no. Possibly the pipeline is optimized for small reads from file streams and not string slicing. In fact, that's likely it. Pull the file out from GridFS into a TemporaryFile, seek(0) on it and use the file handle if possible.
[08:40:27] <GothAlice> kramer65_: Not if it's a string in RAM.
[08:41:00] <GothAlice> Strings are immutable in Python, meaning any string manipulation creates new strings… which can get out of hand if you have multi-megabyte strings.
[08:41:31] <GothAlice> (I.e. repeatedly copying the "remainder" of a string to process into a new string would exhibit slowdown like that.)
[08:42:09] <kramer65_> okay, would you have any tip on how I could pull the file out from gridFS into a temp file and do that seek on it? (I'm kinda lost in this sentence..)
[08:42:12] <GothAlice> If you can't pass in the file handle itself (i.e. read from an open file) use a NamedTemporaryFile, write out to that, then pass the filename in before cleaning up the file.
[08:43:08] <kramer65_> you mean write to the file system and then read back out from that?
[08:43:30] <GothAlice> Aye.
[08:43:44] <GothAlice> It's almost 100% likely that manipulation of large strings is the cause of your slowdown.
[08:44:01] <GothAlice> (I.e. the library extracting the content is badly written.)
[08:44:10] <kramer65_> can't I write to a "file in memory" (if that exists) so that I avoid actually writing to the file system?
[08:44:28] <anonym> guys can someone take a look at this pastie: http://pastie.org/9797089
[08:44:52] <anonym> I need to add a value to my embedded array or create the object that will hold the array
[08:45:12] <anonym> I'm not sure what modifier to use or even if it is possible to do it atomically
[08:47:09] <kramer65_> GothAlice: Would you know if I can use StringIO to write the file in memory and then read out from that?
[08:57:47] <GothAlice> Apologies for not being able to assist further.
[09:00:31] <kramer65_> Alright, thanks a million anyway. I think I got a lot further!
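(ed.: GothAlice's NamedTemporaryFile suggestion as a runnable sketch, assuming PyMongo's gridfs module; file_id and the final pdfminer hand-off are placeholders:)

    import gridfs
    from tempfile import NamedTemporaryFile
    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative database name
    fs = gridfs.GridFS(db)

    grid_out = fs.get(file_id)  # file_id: the _id of the stored PDF (placeholder)
    with NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(grid_out.read())  # one bulk read, then work with a real file
        tmp.seek(0)
        # pass tmp (or tmp.name) to pdfminer instead of a byte string in RAM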
[10:04:43] <repxxl> everything i save in mongodb is saved as a string ... even my numbers. is this correct?
[10:41:41] <GothAlice> repxxl: No
[10:42:32] <GothAlice> repxxl: MongoDB supports many rich datatypes. Using these is critical for querying said data. (I.e. when is 2 > 10? When "2" and "10" are compared.)
[10:42:52] <GothAlice> http://bsonspec.org/spec.html is the specification of what can be stored.
[11:40:30] <amitprakash> Hi, i have a collection C which has a list of dbrefs to another collection D. I want to dump items of collections C and D given a query on C, such that instance(D) belongs to some instance(C) which satisfies the query
[11:40:32] <amitprakash> How do I do this?
[11:40:49] <amitprakash> s/belongs to/has a dbref in
[11:46:26] <repxxl> when i store password hashes like "$2y$10$xVAp1e1qjOpnka7/C5BkL.oCZKVaW6TLAmkPDLUQfcAaHesDMcPE." i have them as strings inside mongodb
[12:03:02] <amitprakash> Or how do I dereference a list of dbrefs back to documents?
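(ed.: one way to do what amitprakash asks, assuming PyMongo, whose Database.dereference() resolves a DBRef; the query and the "d_refs" field name are placeholders:)

    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative

    query = {"field": "value"}  # placeholder: the filter on C

    for c_doc in db.C.find(query):
        # "d_refs" is a hypothetical field holding the list of DBRefs to D
        d_docs = [db.dereference(ref) for ref in c_doc["d_refs"]]
        # dump c_doc together with its dereferenced d_docs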
[13:13:49] <repxxl> how should a hash be saved into the db ? something like "$2y$10$xVAp1e1qjOpnka7/C5BkL.oCZKVaW6TLAmkPDLUQfcAaHesDMcPE."
[13:13:57] <repxxl> like a string ?
[13:20:01] <repxxl> ?
[13:35:44] <Avihay_work> repxxl: that hash IS a string
[13:37:52] <Avihay_work> you could compress it into a shorter string or a binary format, but it's a matter of cheap space vs cheap computation
[13:38:45] <Avihay_work> so in this case you'd rather pick that which is easier and faster to program and debug.
[13:39:58] <Avihay_work> "cheap space vs cheap computation" = you won't gain much space by compressing, and the compression itself is also easy
[13:41:03] <Nours312> Hi, I found a mistake in the mongo php driver: we can save $datas = ['key.forbidden' => 'data] ... but ... we'll have problems using it ... no ? How and where can I report that ?
[13:41:34] <Nours312> sorry : $datas = ['key.forbidden' => "data"]
[13:44:57] <kali> Nours312: this is right, that is a very poor choice of naming
[13:46:28] <Nours312> I know ... the driver should throw an exception, no ?
[13:48:54] <Nours312> and afterwards, in the mongo shell, you can't update the data without renaming all the fields
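(ed.: why a literal dotted key is trouble, as kali and Nours312 discuss: the dot is MongoDB's path separator. A PyMongo illustration with placeholder names; many drivers of that era rejected such keys client-side, though the PHP driver evidently let this one through:)

    from pymongo import MongoClient

    db = MongoClient().test  # illustrative

    # In query/update syntax the dot always means "path into a subdocument":
    db.c.update_one({"_id": 1},
                    {"$set": {"key.forbidden": "data"}},
                    upsert=True)
    # ...this creates {"key": {"forbidden": "data"}}, so a document whose
    # literal top-level field is named "key.forbidden" can never be
    # addressed by normal update/query operators, which is Nours312's problem.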
[13:51:13] <Naeblis> I'd like to merge 2 different query results and then sort them via time. How could I do that?
[13:51:53] <Naeblis> (I already have the individual result sets sorted)
[13:54:36] <Nours312> @Naeblis you push your second array into the first, sort the data, and use them ... no ?
[13:55:07] <Naeblis> except both sets have the time property in different structures
[13:56:10] <repxxl> how to find by _id ?
[13:56:13] <Naeblis> I can access one via items[0].creationtime and the other via items[0].foo[0].creationtime
[13:56:29] <repxxl> like find me doc with _id == 'd54sad14asdasd2'
[13:56:52] <Naeblis> repxxl: .find({_id: ObjectId("...")})
[13:57:03] <Naeblis> mongoose provides a .findById() function as well
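(ed.: the PyMongo equivalent of Naeblis's shell snippet; note the _id must be wrapped in ObjectId, since a plain string won't match. id_string is a placeholder for the 24-character hex string:)

    from bson import ObjectId
    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative

    # _id values are ObjectIds, not plain strings
    doc = db.mycoll.find_one({"_id": ObjectId(id_string)})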
[14:13:35] <Constg> Hello, I have a question: I have an array of documents like this: a: [{type:1, ord:2}] and I'd like to update it like this: update({$addToSet: { a: {type:1, ord:3} }}) but of course, $addToSet will add the doc to the array, and I'd like to tell it that if type=1, then update ord and do not create a new one. Is it possible?
[14:50:15] <Avihay_work> Constg: http://docs.mongodb.org/manual/reference/method/db.collection.findAndModify/#db.collection.findAndModify
[14:53:45] <Constg> Hi Avihay_work, this is what I'm using, but I perform some other operations in it, and my find query is a $or[] operator where I put the query I want to perform first in priority. Basically, I look for _id: 111 $or a.type: 1. So I can't use it to find the index in the a array, because there is some chance that the first operand of the $or (_id = 111) matches, so the other part (a.type = 1) will not be used.
[14:53:50] <Constg> Do you see what I mean?
[14:56:13] <Avihay_work> if you get a document with some query, you can use its _id in the update query
[14:56:33] <Avihay_work> I think I don't understand you
[14:57:15] <repxxl> i have a deep structured document how to insert new fields inside the already created document
[15:00:18] <chetandhembre> hi i added server A to my cluster, which is the primary in replica set "R1".. what if i change the primary in "R1", will it still work ?
[15:04:54] <Constg> Avihay_work, I need to limit the calls to the database (working with PHP), so yes, with two queries it's easy, but I'm trying to make it one query only ;)
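(ed.: the usual two-step pattern for Constg's update-or-append problem, sketched in PyMongo with his field names; note it costs two round trips, which is what he was hoping to avoid:)

    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative

    # Step 1: try to update the array element whose type already matches.
    res = db.c.update_one({"_id": 111, "a.type": 1},
                          {"$set": {"a.$.ord": 3}})

    # Step 2: no such element yet, so append one instead.
    if res.matched_count == 0:
        db.c.update_one({"_id": 111},
                        {"$push": {"a": {"type": 1, "ord": 3}}})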
[15:21:11] <repxxl> when i have a document with arrays like 0 => array, 1 => array, 2 => array, how do i update with mongodb to actually continue from the last number, so it will go 3 => array .. etc
[17:23:14] <Mkop> I'm confused by the mongo authentication model. I'm able to connect using `mongo dbserver/foo -u user -p` and then `use bar` but I can't `mongo dbserver/bar -u user -p`
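(ed.: Mkop's symptom usually means the user was defined on foo, so foo must be the authentication database even when working in bar; a hedged PyMongo sketch using the standard authSource URI option, with credentials and hostname as placeholders:)

    from pymongo import MongoClient

    # Credentials live in "foo" (the authentication database),
    # even though the application works with "bar".
    client = MongoClient("mongodb://user:secret@dbserver/?authSource=foo")
    bar = client.bar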
[18:38:32] <proteneer> if only Mongo had a way to do multi document transactions
[18:38:37] <proteneer> then it wouldn’t be a complete shit db
[18:45:17] <cheeser> ooh burn!
[19:04:16] <hicker> If I want to mutate the data before sending a response, can I do that? Or is that a big no no
[19:08:49] <nehaljwani> Hi! After mongodump on server1 and mongorestore on server2, I am not getting the same output for db.getUsers(): on server1 I see one user and on server2 I see none.
[19:08:52] <nehaljwani> Could someone please help?
[20:52:31] <repxxl> i have a key "data" that is holding the data like 0: {}, 1: {}, 2: {}, 3: {} and i want to add data with mongodb so it continues like 4: {} ....
[21:02:31] <cheeser> why not use an array?
[21:02:36] <joannac> repxxl: why is it a subdocument rather than an ... dammit cheeser
[21:02:47] <joannac> what cheeser said
[21:02:56] <repxxl> joannac cheeser _?
[21:03:25] <joannac> why aren't you using an array?
[21:03:32] <joannac> then you can do stuff like $push
[21:03:40] <repxxl> joannac using array
[21:03:47] <repxxl> joannac u using php with mongodb ?
[21:04:19] <joannac> no. i'm not a dev
[21:05:07] <repxxl> joannac i thought arrays don't exist in mongodb. i thought it works like arrays but is called "multikeys" instead
[21:05:44] <repxxl> joannac so when i hear arrays here in this channel i always think of a programming language, php, etc.
[21:06:41] <joannac> no, arrays definitely exist in mongodb
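(ed.: cheeser and joannac's suggestion as a PyMongo sketch: store "data" as a real array and let $push append, with no manual 0/1/2/3 bookkeeping. doc_id is a placeholder:)

    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative

    # $push appends to the end of the array automatically.
    db.c.update_one({"_id": doc_id},
                    {"$push": {"data": {"some": "subdocument"}}})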
[21:27:15] <repxxl> how to findOne when i have arrays like 0 =>, 1 =>, 2 => and i need to find something that is inside these arrays.
[21:47:48] <windsurf_> I need to determine the distance between two points (point == lat/lng) and be able to search for points that are x distance from origin point. I’m trying to understand some of the terminology...
[21:48:05] <windsurf_> I have documents currently with lat and lng fields
[21:48:28] <windsurf_> it looks like I need to change these into a nested field, we’ll call it ‘loc’
[21:48:59] <windsurf_> and I think the structure needs to be this { type: "Point", coordinates: [ -73.97, 40.77 ] }, where Point and ‘coordinates’ are reserved names with meaning to mongodb right?
[21:51:21] <windsurf_> and when and why do I need to do db.collection.ensureIndex(...) ?
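(ed.: a sketch of the setup windsurf_ describes, assuming PyMongo; "type" and "coordinates" (longitude first) are indeed the GeoJSON form MongoDB expects, and the ensureIndex call he mentions is what creates the 2dsphere index that $near queries require:)

    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative

    db.places.insert_one({
        "name": "somewhere",
        "loc": {"type": "Point", "coordinates": [-73.97, 40.77]},  # [lng, lat]
    })

    # The 2dsphere index is what makes $near queries possible.
    db.places.create_index([("loc", "2dsphere")])

    # Points within 5 km of the origin point:
    cursor = db.places.find({"loc": {"$near": {
        "$geometry": {"type": "Point", "coordinates": [-73.97, 40.77]},
        "$maxDistance": 5000,  # metres
    }}})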
[22:27:40] <repxxl> how to find something in multiple arrays where the keys are named numbers? i'm using $push to create the arrays ..
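(ed.: assuming the data is stored as a real array as suggested above, reaching inside it is dot notation or $elemMatch; a PyMongo sketch with illustrative field names:)

    from pymongo import MongoClient

    db = MongoClient().mydb  # illustrative

    # Dot notation matches if any element of the array has x == 5.
    db.c.find_one({"data.x": 5})

    # $elemMatch requires all conditions to hold on the same element.
    db.c.find_one({"data": {"$elemMatch": {"x": 5, "y": {"$gt": 0}}}})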