[07:17:56] <obiwahn> joannac: why doesn't pymongo convert sets to lists?
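(On the pymongo question above: BSON has no set type, and a Python set has no defined element order, so the driver raises InvalidDocument rather than silently picking an order. A minimal pymongo sketch of the explicit conversion; database/collection names are made up:

    from pymongo import MongoClient

    coll = MongoClient().test.example  # hypothetical names, local mongod assumed

    tags = {"alpha", "beta", "gamma"}        # a Python set; BSON arrays are ordered, sets are not
    coll.insert_one({"tags": sorted(tags)})  # convert explicitly, choosing the order yourself
)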
[07:31:23] <bluesm> What should I do if my database schema is clearly "joinable"... That is, I have a "text_descriptors" table which is joined with the "word_descriptors" table...
[07:32:13] <bluesm> (Like texts contain words, and words are assigned to a specific text) (I need to split out the words since it is a language learning app, where every word/phrase needs a translation)
[08:47:40] <rspijker> bluesm: Is this a schema question, or… ?
[09:05:51] <rspijker> bluesm: I’d probably just go with a texts collection and a words collection. Your app will probably always be in the context of a specific text. So then you can just handle the ‘join’ in the application layer.
[09:06:10] <rspijker> That is, if you are working with a specific text, you can just make the textId part of your queries on the words collection
[09:06:32] <rspijker> Don’t think embedding would work well here, due to the translation aspect
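(A sketch of the application-layer 'join' rspijker is describing, using pymongo; the database, collection, and field names are all assumed:

    from pymongo import MongoClient

    db = MongoClient().langapp  # hypothetical database name

    def text_with_words(text_id):
        # the 'join' lives in the application: fetch the text, then
        # query the words collection with the textId as part of the filter
        text = db.texts.find_one({"_id": text_id})
        words = list(db.words.find({"text_id": text_id}))
        return text, words
)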
[09:09:30] <bluesm> rspijker: Yeah. I'll just handle the joins myself.
[09:14:55] <bluesm> rspijker: And do you think I should handle the join myself for the "repetitions_words_descriptors", which in a normal SQL database has the fields "user_id" (linking to the user), "word_descriptor" (linking to the word with its translation), "repetition_count", "easiness_factor", and other properties of the repeated word? (The
[09:14:55] <bluesm> repetitions_words_descriptors are like an extended "words_descriptors".)
[09:16:12] <rspijker> I think you should embed most of these properties, where it makes sense at least
[09:16:22] <rspijker> that’s hard to say though without knowing the exact specifics
[09:17:40] <bluesm> rspijker: Embed in the user? Like an array of the user's words in the User document?
[09:18:09] <rspijker> That doesn’t make much sense to me, but again, I don't know how your application works
[09:22:12] <bluesm> rspijker: Ok. I'll experiment. Performance isn't that big of a deal right now. Thanks for your time and effort!
[09:23:04] <bluesm> rspijker: I think that I need to create separate documents for repetitions_words_descriptors since I'll need to sort them (like by the "repetition_count" field)
[09:23:13] <bluesm> rspijker: So again Thank you :)
[09:23:32] <rspijker> bluesm: you can sort the same collection multiple ways of course...
[09:23:45] <rspijker> you can have multiple indices on a single collection, to make it fast
[09:24:36] <rspijker> The main thing is that the way in which the data is stored makes sense to you and makes sense in the way your application uses the data
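(To illustrate rspijker's point, a pymongo sketch with multiple indices on one collection; field names are borrowed from bluesm's schema, the rest is assumed:

    from pymongo import MongoClient, ASCENDING, DESCENDING

    reps = MongoClient().langapp.repetitions_words_descriptors  # hypothetical names

    # one collection, several indices: each index serves a different sort
    reps.create_index([("user_id", ASCENDING), ("repetition_count", DESCENDING)])
    reps.create_index([("user_id", ASCENDING), ("easiness_factor", DESCENDING)])

    # this sort can now be answered from the first index
    cursor = reps.find({"user_id": 42}).sort("repetition_count", DESCENDING)
)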
[09:42:47] <slap_stick> hey, i was reading about replica sets and from what i can tell, you are not able to define a replSet via the configuration file that gets passed to mongod, but have to provide it to mongod as a parameter, i.e. --replSet "name". Is that really the case?
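(For the record, the replica set name can also be set in the configuration file; in the YAML format introduced with MongoDB 2.6 the option is replication.replSetName, and the legacy ini-style format uses replSet — "rs0" below is a placeholder name:

    # mongod.conf, YAML format (MongoDB 2.6+)
    replication:
      replSetName: "rs0"

    # legacy ini-style equivalent:
    # replSet = rs0
)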
[09:50:54] <ksinkar> hello, I am running Ubuntu 12.04 and have installed mongodb using apt-get. Is it a good idea to use the standard mongodb package provided by Ubuntu?
[09:51:25] <ksinkar> yes, and the installation from ubuntu does not work out of the box
[11:27:03] <devastor> Any ideas why there's a limit of 1024MB for the chunk size even when manually trying to move a chunk to another shard? How can I move a jumbo chunk bigger than that?
[11:30:12] <cheeser> this might help: https://groups.google.com/forum/#!topic/mongodb-user/YErQ7gu2Ry8
[11:34:09] <nfroidure_> do some of you use mongodb with relational data? How do you handle constraints and joins as you would with a traditional SQL database?
[11:34:41] <nfroidure_> The same for transactions.
[11:37:51] <devastor> cheeser, I read it already, but it doesn't help, because the chunk can't be split anymore and mongo does not allow setting the chunk size to anything bigger than 1024; at 1024 it still says that it's too big
[11:38:10] <devastor> 1024 does sound like some artificial limit, though
[11:39:33] <cheeser> he says he changed his chunk size to 10000MB
[11:40:24] <devastor> cheeser, it worked for him when he set it to 120 instead of 10000 (because anything bigger than 1024 is just ignored), but 120 or 1024 is not enough in our case
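(For context, the chunk size being discussed lives in the config database's settings collection; the value is in MB, and as devastor notes, anything above 1024 is not accepted. A hedged pymongo sketch of changing it; the hostname is a placeholder:

    from pymongo import MongoClient

    client = MongoClient("mongos-host", 27017)  # connect to a mongos

    # chunk size is stored in config.settings in MB
    client.config.settings.update_one(
        {"_id": "chunksize"}, {"$set": {"value": 120}}, upsert=True
    )
)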
[11:42:50] <bmcgee> Hey guys, I just updated my java driver from 2.11.x to 2.12.x and have found a breaking change in the behaviour of tailable cursors. If I add the option QUERYOPTION_TAILABLE the driver automatically adds QUERYOPTION_AWAITDATA forcing the cursor to block for new data when calling next(). This did not happen in 2.11.x and from what I can see in the API there is no way of unsetting the await data option.
[11:45:49] <bmcgee> I meant hasNext() blocks on the cursor now
[11:46:10] <bmcgee> I’ve found why this change happened: https://jira.mongodb.org/browse/JAVA-1091
[11:46:30] <bmcgee> “There are no circumstances where a user would not want to also set Bytes.QUERYOPTION_AWAITDATA, so if the former is set, the latter will now also be set.”
[11:46:46] <bmcgee> I can't think of one use case... mine
[11:46:56] <cheeser> keep reading that description...
[11:47:08] <cheeser> the old behavior violates the contract of Iterator
[11:47:49] <bmcgee> bit annoying for my use case though, I wasn’t affected by that
[11:48:30] <bmcgee> I have an actor which tails a collection and on each read exhausts the cursor creating batches. When hasNext() returns false it schedules another attempt at reading a bit later
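(For comparison, pymongo exposes the two query options separately, so the pattern bmcgee describes can be written with a non-blocking tailable cursor. A sketch against a hypothetical capped collection; handle() is a made-up callback:

    import time
    from pymongo import MongoClient, CursorType

    events = MongoClient().test.events  # hypothetical capped collection

    # CursorType.TAILABLE does not set await-data, so iteration returns
    # as soon as the currently available documents are exhausted
    # (CursorType.TAILABLE_AWAIT is the blocking variant)
    cursor = events.find(cursor_type=CursorType.TAILABLE)
    while cursor.alive:
        batch = list(cursor)   # exhaust the cursor, creating a batch
        if batch:
            handle(batch)      # hypothetical handler
        else:
            time.sleep(1)      # schedule another read attempt a bit later
)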
[11:54:47] <bmcgee> cool, in the meantime I need to make a choice: do I like cursors from aggregation results more than i like a tailable cursor which doesn’t await data….
[11:56:15] <bmcgee> cheeser: any idea when 1255 will be available?
[12:12:17] <Guest10682> I'm afraid I'm not sure I'm at the right place to ask this: I have a polygon that I want to carve up into a predefined number of subpolygons, each with the same area. Does anyone have an idea?
[12:23:44] <remonvv> Although that's actually a rather difficult problem.
[12:25:18] <remonvv> bmcgee : I object to your implication that C++ programmers are the most obvious heroes to solve that (wo)man's mathematical problem!
[12:26:44] <bmcgee> remonvv: as a Java/Scala developer I intended no such implication. What little part of my brain I devoted to processing the question simply flagged the keyword C++, thus the resultant semi-sarcastic remark was a little half baked
[12:27:04] <bmcgee> remonvv: I apologise profusely ;)
[12:29:13] <remonvv> bmcgee: I'm sorry but the damage is done. You are now on my list.
[12:30:30] <bmcgee> remonvv: f***, I will go prostrate in front of the rails devs as punishment…
[12:33:33] <remonvv> bmcgee : I'll admit, albeit reluctantly, that that is a punishment that fits the crime.
[12:35:58] <remonvv> bmcgee : Scala for fun or Scala for pay?
[12:43:48] <remonvv> I've never even heard of that. Lmgt
[12:44:06] <bmcgee> remonvv: it was developed by the JetBrains guys
[12:44:15] <bmcgee> parts of intellij are written in it i think
[12:44:29] <bmcgee> simplest way I could describe it is pragmatic Scala
[12:45:51] <remonvv> Hm, I'll look into it. Sounds fun.
[12:47:55] <bmcgee> cheeser: I just thought I’d be sneaky and try to go around the API with some oh so very hacky reflection manipulation of the _options flag in the DBCursor…… but yeah, sneaky didn’t work :(
[12:52:38] <bmcgee> cheeser: back to 2.11.x I go….
[12:58:48] <remonvv> Kotlin's null safety alone gives me tingles all over
[13:00:32] <bmcgee> remonvv: we have very different triggers for tingly…
[13:07:32] <remonvv> bmcgee: I lead a sad life kind sir.
[13:22:46] <uehtesham90> how can i output the results of a find() to a new collection in the mongo shell??
[13:23:25] <uehtesham90> do i do something like: db.source_collection.find(query).copyTo(destination_collection)
[13:26:18] <bmcgee> uehtesham90: only method I know for outputting the results to another collection is with the aggregation framework
[13:26:58] <uehtesham90> but the problem with that is that I can group by the specific fields, but then it will only output those fields
[13:27:25] <uehtesham90> i also want to output additional fields which are not part of the group parameter... is there a way to do that in aggregation?
[13:28:44] <bmcgee> uehtesham90: I was wrestling with a similar problem where I would have multiple versions per key I wanted to group by, and what I wanted was the first document for each group id. But I wanted the whole document. I messed around with specifying all the document fields with a $first parameter before I just re-structured the problem to avoid it.
[13:29:12] <uehtesham90> what did you define in the $first parameter?
[13:32:09] <bmcgee> I did something like { $group: { _id: { foo: '$foo', bar: '$bar' }, strField: { $first: '$strField' }, intField: { $first: '$intField' }, ... } }
[13:56:58] <rspijker> why would you need to group if you want to get the results of a find uehtesham90 ?
[13:56:59] <rspijker> just use $match and no $group?
[13:57:12] <rspijker> ‘additional fields of a document’ don’t make much sense in the context of grouping...
[13:58:13] <uehtesham90> so what i want is to get all records that satisfy a certain query and then sort them by some fields and output the result directly to either a new collection or even better a csv file
[13:59:04] <uehtesham90> i don't know how to output the results of my find and sort to a csv file. i cannot output to a new collection as i'll have to use aggregation
[14:02:04] <rspijker> uehtesham90, so how about: db.coll.aggregate({$match: {query}}, {$sort: {"field": direction}}, {$out: "collection"})
[14:02:22] <rspijker> not every aggregation needs to be a group…
[14:02:58] <rspijker> if I want csv output, what I usually do is use robomongo as a client and just use javascript to output csv, then copy it into my text editor manually...
[14:03:10] <rspijker> but that of course only works if you don't need it automated
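(Putting rspijker's two suggestions into one pymongo sketch; the collection, field, and file names are all placeholders:

    import csv
    from pymongo import MongoClient, DESCENDING

    coll = MongoClient().test.records  # placeholder collection

    # option 1: $match + $sort + $out writes the result to a new collection, no $group needed
    coll.aggregate([
        {"$match": {"status": "active"}},
        {"$sort": {"created_at": DESCENDING}},
        {"$out": "sorted_records"},
    ])

    # option 2: a plain find + sort, streamed to a csv file from the driver
    with open("out.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["_id", "status", "created_at"])
        for doc in coll.find({"status": "active"}).sort("created_at", DESCENDING):
            writer.writerow([doc["_id"], doc.get("status"), doc.get("created_at")])
)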
[14:03:40] <uehtesham90> hey rspijker... thanks a lot... you're right... i don't need to group it
[14:21:16] <remonvv> I don't think manually modifying the chunks collection will end well. That collection is managed by mongos and you might cause consistency issues.
[14:21:30] <rspijker> remonvv: different type of chunks buddy ;)
[15:57:35] <ggoodman> I'm going to be doing queries that hit a couple fields via $text, potentially an array field and sorted descending by another field.
[15:58:05] <ggoodman> I've read that there is some nuance in the order I use to define this type of index to maximize performance.. can anyone chime in?
[17:25:55] <nEosAg> anybody having issues with MMS?
[17:26:24] <nEosAg> i can't see any agent logs on the MMS page
[17:26:54] <nEosAg> i have configured Hosts and cross-checked the instructions 2-3 times..
[17:27:21] <nEosAg> i can even see access logs from the "being monitored" servers
[18:16:55] <uehtesham90> can i run two aggregation commands in two different mongo shells without one affecting the other... both commands aggregate from the same collection on the same key but different values
[19:41:15] <hrrld> There's some example code on this page: http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/ that defines a function called "getNextSequence" ... Is this javascript running inside mongo?
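(Yes, getNextSequence in that tutorial is JavaScript run in the mongo shell. The same counters pattern from a driver, sketched in pymongo; the database name is a placeholder:

    from pymongo import MongoClient, ReturnDocument

    db = MongoClient().test  # placeholder database

    def get_next_sequence(name):
        # atomically increment and return the counter, mirroring the
        # tutorial's findAndModify-based shell helper
        counter = db.counters.find_one_and_update(
            {"_id": name},
            {"$inc": {"seq": 1}},
            upsert=True,
            return_document=ReturnDocument.AFTER,
        )
        return counter["seq"]

    db.users.insert_one({"_id": get_next_sequence("userid"), "name": "Sarah C."})
)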
[19:50:39] <thebope> Currently we do things like this in the queries in my java application: withField("someObject.someField in", someList)
[19:51:00] <thebope> If .someField is a list, can I do "contains" in place of "in"?
[19:51:08] <thebope> and use someVariable instead of someList?
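(On thebope's question: MongoDB matches array fields element-wise, so no separate "contains" operator is needed; plain equality (or $in) already behaves that way. A pymongo sketch with placeholder names and values:

    from pymongo import MongoClient

    coll = MongoClient().test.things  # placeholder collection

    # if someObject.someField is an array, equality matches documents
    # whose array *contains* the value...
    coll.find({"someObject.someField": "someValue"})

    # ...and $in matches documents whose array shares any element with the list
    coll.find({"someObject.someField": {"$in": ["a", "b", "c"]}})
)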
[20:15:06] <smik> I have a question about what would be a good way of storing "Following" kind of information. User A is following user B. What I did is have a 'following' array for each user and add the ObjectIDs of all users he is following. But how can I determine (efficiently) how many people follow a particular user?
[20:19:35] <smik> Is using aggregation the right way?
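(Aggregation isn't needed for the count itself: equality against an array matches element-wise, so counting the documents whose 'following' array contains the target user works directly, and a multikey index keeps it efficient. A pymongo sketch; the names are assumed:

    from pymongo import MongoClient, ASCENDING

    users = MongoClient().app.users  # placeholder names

    users.create_index([("following", ASCENDING)])  # multikey index over the array

    def follower_count(user_b_id):
        # counts every user whose 'following' array contains user_b_id
        return users.count_documents({"following": user_b_id})
)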
[21:12:12] <Pulpie> does upserting take longer than updating?
[21:26:49] <bob_db> hello, have a question: for text search, do I have to do anything for find to match non-alphanumeric characters?
[21:27:05] <bob_db> this query, for example, does not return results even though I know it should: db.collection.find( { $text: { $search: "+term" } } );
[21:30:50] <Pulpie> is it possible to batch updates together for optimization?
[21:31:38] <Pulpie> such as I have 500k lines to update in a database. It's taking forever.
[21:31:56] <Pulpie> all of those updates are to different collections
[21:32:14] <Pulpie> or not collections but documents. Different documents.
[21:32:41] <Pulpie> Is there a way to batch update in a way I can just update all the documents I need to with one command?
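(One way to batch them is the bulk write API, which sends many updates per round trip instead of one each. A pymongo sketch; the filter, update, and id list are placeholders:

    from pymongo import MongoClient, UpdateOne

    coll = MongoClient().test.docs  # placeholder collection

    ops = [
        UpdateOne({"_id": doc_id}, {"$set": {"processed": True}})
        for doc_id in ids_to_update  # hypothetical iterable of _id values
    ]
    # send in chunks; ordered=False lets the server keep going past individual failures
    for i in range(0, len(ops), 1000):
        coll.bulk_write(ops[i:i + 1000], ordered=False)
)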
[21:40:23] <bob_db> is it possible to do the text search I mentioned previously without having to do a regex?
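(On bob_db's question: the text index tokenizes on punctuation, so "+term" is indexed as just "term" and $text cannot distinguish the two; an escaped regex is the usual fallback. A pymongo sketch; "body" is a placeholder field:

    import re
    from pymongo import MongoClient

    coll = MongoClient().test.collection  # placeholder names

    pattern = re.escape("+term")  # escapes to r"\+term", so the '+' is matched literally
    docs = coll.find({"body": {"$regex": pattern}})
)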
[22:04:25] <future_phd> I noticed that I don't have to index my collection by date if I want to query it and have results sorted by date. Should I index anyway, or is this reliable in general?
[22:06:26] <future_phd> basically, I'm saying that I noticed that documents are stored in the order that they are inserted, and I have a field for the date of insertion, and, so far, I don't have to index by that field
[22:06:44] <future_phd> but I don't know if this could go wrong after sharding or something later on in the db lifecycle
[22:07:13] <future_phd> someone let me know what you think, I have time. lol
[22:08:04] <kali> future_phd: update may break the ordering
[22:09:03] <kali> future_phd: but if you're using objectIds as _id, you can use the "free" _id index to get roughly a creation date order
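(A pymongo sketch of kali's two options; the collection and field names are assumed:

    from pymongo import MongoClient, DESCENDING

    coll = MongoClient().test.events  # placeholder collection

    # ObjectIds embed a timestamp, so the built-in _id index gives a
    # rough creation-date order for free
    roughly_newest_first = coll.find().sort("_id", DESCENDING)

    # or index the date field explicitly, which stays correct even after
    # updates move documents or the collection is sharded
    coll.create_index([("inserted_at", DESCENDING)])
    newest_first = coll.find().sort("inserted_at", DESCENDING)
)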
[23:00:08] <cirwin> mongorestore on my otherwise-idle MacBook Pro is proceeding at about 3MB/s — is there any way to speed it up?