[00:45:06] <bufferloss> how should I model joined data in a document store?
[00:49:13] <bufferloss> like what if I have a typo, such as "Californea" and I need to change that to "California"
[00:49:29] <bufferloss> if it's in a bunch of places, literally, then I have to search and replace in lots of documents in lots of fields, right?
[00:49:39] <bufferloss> is it still reasonable to use "joins" for this type of thing?
[00:49:45] <bufferloss> like a "states" collection?
[00:53:18] <Boomtime> bufferloss: consider that you appear to be trying to optimize an irregular condition - sure, typos occur, but unless they make up the majority of your operations you shouldn't be too concerned with what it takes to fix them - optimize the _common_ operations
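(When the rare fix does need to happen, a multi-document update handles the search-and-replace in one shell command. A minimal sketch, assuming a hypothetical agents collection with a top-level state field:

    // hypothetical collection and field names; { multi: true } makes the
    // update touch every matching document instead of just the first one
    db.agents.update(
        { state: "Californea" },
        { $set: { state: "California" } },
        { multi: true }
    )

Each field holding the duplicated value needs its own update of this shape.)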
[00:54:02] <bufferloss> Boomtime, well, so then let's say it's not typos
[00:54:17] <bufferloss> let's say there are legitimately times when, not for a typo reason, an update in one place must be made in others
[00:54:59] <bufferloss> Boomtime, also, I'm working with or have encountered in my problem domains some data structure approximately similar to this: http://pastie.org/9834385
[00:56:21] <bufferloss> Boomtime, so, for example, if an agent alias changes on the level of available aliases, then that change should reflect down to the agents
[00:57:34] <bufferloss> Boomtime, here's maybe a slightly better example: http://pastie.org/private/2mc9nz4e8lsygevgmstq
[00:57:54] <bufferloss> Boomtime, let's say that "white spider" is a call sign that means something important, e.g. when it's used, the agent is to pick up a package at a given location
[00:58:14] <bufferloss> then let's say "white spider" is discovered by the enemy, so the call sign is changed to "brown recluse" or something
[00:58:29] <bufferloss> Boomtime, I would need/want to change the "aka" field of anything that had been white spider
[00:58:58] <bufferloss> I'm also leaving out a bit of data here, such as a flag indicating the usage/purpose of the call sign, but I'm just saying I come across very similar things in some of the problem domains I work with
[00:59:31] <bufferloss> just the need for some kind of "join-style" behavior, which usually becomes cumbersome if I need to maintain a running record/log/knowledgebase of all the places where a thing may need to be updated
[01:00:03] <bufferloss> Boomtime, so is it still generally recommended that I just make those updates? or is it reasonable to use something similar to a foreign key and a join as I might have used in a "traditional" RDBMS
[01:13:40] <Boomtime> bufferloss: although a "join" is not possible, you can still use references and resolve them yourself all you like and, yes, it is reasonable, sometimes even unavoidable
[01:14:09] <bufferloss> Boomtime, ah ok, what are references, and by that, I mean, where would I find the docs on that? :)
[01:14:17] <Boomtime> my point before was about challenging the assumption that storing the same data twice is a bad thing
[01:14:48] <Boomtime> bufferloss: a reference is a generic term, a placeholder for whatever you choose to use to refer to a link between data
[01:15:11] <Boomtime> there are no joins in mongodb
[01:15:28] <bufferloss> Boomtime, ok, so within the same query I can't cobble together a document that comes from two separate collections right?
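(Right - in the MongoDB of this era there is no server-side join, so a "reference" is just a stored _id that the application resolves with a second query. A minimal sketch, assuming hypothetical agents and callsigns collections:

    // hypothetical collections; assumes the agent document exists and
    // carries a callsignId referring to a document in callsigns
    var agent = db.agents.findOne({ codename: "007" });
    var sign = db.callsigns.findOne({ _id: agent.callsignId });
    // with the call sign stored once, renaming it is a single update
    db.callsigns.update({ _id: sign._id }, { $set: { aka: "brown recluse" } })

The trade-off is the extra round trip on every read, which is why Boomtime suggests duplicating data when reads dominate.)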
[02:11:15] <jayjo> Is there a functional difference between running a designated database server on AWS or Azure, or hosting a VM that solely runs a server daemon?
[02:17:18] <jayjo> If my goal is to run a designated mongo server, and AWS and Azure don't give that option (but they do give the option for SQL, Postgres, MySQL), am I losing out on something by spinning up another instance whose only purpose is to host the database?
[02:17:38] <jayjo> Are the other options optimized in some way?
[02:19:19] <Boomtime> jayjo: you can either deploy your own system, as with any database product, or get somebody to host it for you, as with any database product - i'm not sure what difference you are looking for
[02:19:53] <Boomtime> one would hope that any hosting provider which hosts an instance for you does so with some knowledge of how to optimize it
[02:26:40] <jayjo> Boomtime: great resource, thanks. It looks like they're running Amazon Linux. I guess my question is less mongo-specific, but I am using mongo. Is there an advantage to using the services in the link you sent me or deploying your own system? Other than standard sys admin tasks, is the machine optimized for databases?
[02:27:14] <jayjo> Other than eliminating the admin tasks from what you're required to do, I mean
[02:28:03] <Boomtime> i have no idea for those ones, those are the images AWS hosts directly - but the same is true for MSSQL/Oracle/etc.. how much are the images on AWS optimized for these? and how do you know?
[02:29:05] <Boomtime> to the question: "can you do better by optimizing a machine yourself" almost certainly the answer is yes in every case
[02:30:02] <Boomtime> the machines you purchase from providers aren't optimal for your case - they are the middle of the road that is most applicable for most use cases
[02:30:41] <Boomtime> you don't purchase hosting for optimization, you purchase it for convenience
[02:31:47] <jayjo> But it seems that the machine images that have the database hosted are just VMs that save you from installing and setting up your database yourself.
[02:32:19] <jayjo> at base level they are all just EC2 instances, solely running the database
[02:32:32] <Boomtime> sure, what else would you expect?
[02:33:30] <jayjo> I don't know, I guess. I'm learning.
[02:33:57] <Boomtime> when you purchase hosting you do so on specification (cpu, ram, disk) - regardless of how that specification is met you should get what you paid for
[04:36:37] <here4thegear> whenever I type mongod at the command line I get an "addr already in use" error and it shuts down. Is there a way to allow multiple connections?
[04:38:09] <cheeser> mongod starts the server. mongo is the client.
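(That error means another process, usually an already-running mongod, owns the default port 27017. One mongod accepts many concurrent client connections, so you rarely need a second server just to connect twice. A sketch:

    # connect a client to the mongod that is already running:
    mongo
    # if a second server instance really is wanted, it needs its own port
    # and data directory (the directory must already exist):
    mongod --port 27018 --dbpath /data/db2

)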
[09:48:40] <davy> hi guys, I have the following data: {"data" : "test", "details" : [ {'a' : 'a', 'b':'b' }, {'c' : 'c', 'd':'d' }]} - is there a way to add a new key to every hash I have in the details array?
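(The shell of this era has no operator that updates every array element in place, so one way is to rewrite each document's array. A sketch; the collection name and the key/value to add are placeholders:

    // rewrite the details array of every document that has one
    db.coll.find({ details: { $exists: true } }).forEach(function(doc) {
        doc.details.forEach(function(item) {
            item.newKey = "newValue";   // placeholder key/value
        });
        db.coll.update({ _id: doc._id }, { $set: { details: doc.details } });
    })

)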
[10:09:33] <ammbot> Does anyone know an Erlang client that supports replica sets?
[10:18:24] <gregf_> i need to install the latest mongo client on ubuntu 12.04
[10:19:25] <gregf_> i've followed some steps from here: http://www.mkyong.com/mongodb/how-to-install-mongodb-on-ubuntu/
[10:19:47] <gregf_> I previously had version 2.0.x
[10:20:27] <gregf_> after following the steps mentioned in the link (uninstall) it got upgraded to 2.4.12
[10:20:51] <gregf_> I need the latest one, 2.6.x. Please, if someone could help
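(The 2.4.x packages come from Ubuntu's own repositories; 2.6.x requires MongoDB's repository. For reference, the official install steps for 2.6 on Ubuntu at the time were roughly:

    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
    echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
    sudo apt-get update
    sudo apt-get install -y mongodb-org

)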
[10:20:52] <trepidaciousMBR> What happens if I use a Java driver version higher than the version of the database I'm connecting to?
[10:24:52] <trepidaciousMBR> Ah, in fact it looks like there is no actual correspondence between the Java driver version and Mongo version - driver is at 2.12 and mongo at 2.6?
[12:53:21] <Raffaele> hello. I'm making a change to cross-compile for MIPS. I hit a compilation failure because it tries to use posix_fallocate(), which isn't supported by the uclibc for my target platform
[12:54:36] <Raffaele> I have a couple of ways of fixing this, but I was wondering if there's a need to discuss these options first, or if I should just make the change and have it reviewed once I submit the patch
[13:35:27] <cheeser> i think you're pre-worrying about a problem that doesn't exist
[13:37:35] <olso> robomongo confused me I think, it returned all data from the collection with .find()
[13:37:46] <olso> I think that this http://stackoverflow.com/a/24222084 is what I'm looking for, cheeser
[13:40:43] <cheeser> well, that's different from how most of the other drivers work. java and c#, e.g., just iterate the cursor. they don't store the entire result set in memory.
[13:41:00] <cheeser> i'm not *entirely* sure that's what http://mongodb.github.io/node-mongodb-native/api-generated/cursor.html#each is saying either.
[13:41:37] <cheeser> i think it's just saying that it holds the entire batch in memory which would be consistent with the other drivers.
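(For illustration, the node-mongodb-native cursor iteration of that era looks like this; the collection name is a placeholder:

    // docs arrive one at a time and a null doc signals exhaustion; only the
    // current batch is held in memory, not the full result set
    db.collection('items').find().each(function(err, doc) {
        if (err) throw err;
        if (doc === null) return;   // cursor exhausted
        console.log(doc);
    });

)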
[13:59:23] <Lujeni> Hello - can I add a downtime or ack a specific host or alert through MMS or the public API? thx
[14:21:27] <MadWasp> hello guys, I'm using mongodb with spring-data-mongodb repositories. my problem is some inserts in my application - they take 33-40ms per insert. is that normal behavior?
[14:25:29] <MadWasp> I'm doing them in a loop, but when I save them all at once the time it takes is just the sum of all the single times
[15:04:06] <StephenLynx> it was to the dude that thought he had to iterate through his collection in the code
[15:05:11] <MadWasp> can somebody help me with my problem? :(
[15:06:20] <StephenLynx> my suggestion is to stop using spring
[15:06:33] <StephenLynx> and go vanilla with mongo.
[15:06:40] <StephenLynx> try that and benchmark it.
[15:12:48] <MadWasp> I took a further look, and setting writeConcern to UNACKNOWLEDGED or ERRORS_IGNORED fixes the problem. is that a good solution?
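(Unacknowledged writes hide errors as well as latency. If each save is a separate acknowledged round trip, a genuine batch insert is the safer fix - one network round trip for the whole batch. A sketch in the mongo shell; the collection name is a placeholder:

    // build the batch client-side, then insert it in a single call
    var docs = [];
    for (var i = 0; i < 1000; i++) {
        docs.push({ n: i });
    }
    db.items.insert(docs)

)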
[15:30:57] <kakashiA1> hey guys, does mongoose have any kind of built-in promises?
[15:44:10] <StephenLynx> no idea. tried asking in their channel?
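(For what it's worth, in the mongoose of that era a query's exec() returned a promise (mpromise). A sketch; the User model is hypothetical:

    // exec() returns a then-able promise instead of taking a callback
    User.findOne({ name: 'kakashi' }).exec().then(function(doc) {
        console.log(doc);
    });

)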
[15:46:43] <royo1> Hi, I have a question regarding indexes and aggregation in mongodb.
[15:46:43] <royo1> Say I have a collection which stores documents that contain a polygon representing an area, as well as an embedded document with a name, financial transactions, and other information (with raw size between 256kb and 1mb) which needs indexes - say net profit. I may want to search for net profit for polygons that my coordinate falls within. Will I be able to create an index or combination of indexes that cover the queries if I am using the db.a
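(The message is cut off, but the query shape being described looks roughly like the sketch below; collection and field names are hypothetical. Note the 2dsphere index accelerates the polygon lookup, but a geo query of this kind is not a covered query - the server still fetches the documents themselves:

    // find areas whose polygon contains a given point
    db.areas.ensureIndex({ polygon: "2dsphere" })
    db.areas.find(
        { polygon: { $geoIntersects: {
            $geometry: { type: "Point", coordinates: [ -73.97, 40.77 ] }
        } } },
        { "details.netProfit": 1 }
    )

)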
[16:34:58] <Guest37025> for anybody who has some advice in terms of mongo
[16:35:13] <Guest37025> I am in the process of creating a geospatial collection that will need to scale
[16:35:34] <Guest37025> I want to have embedded documents ranging from 256 KB to 5 MB in each of the geospatial documents
[16:35:58] <Guest37025> My fear is that over time there will be so many documents that when I do a geospatial search for all points near a given point, the read query will become extremely slow
[16:36:11] <Guest37025> I want to know how well mongo scales with geospatial indexes with embedded data
[16:36:20] <Guest37025> and how the performance compares to doing 2 round trips in mongo
[16:36:49] <Guest37025> does postgis scale better and read a lot faster than mongo?
[16:37:12] <Guest37025> the problem with postgis is I would do one round trip to postGIS to get ids
[16:37:25] <Guest37025> then deal with the latency of going to mongo for a second round trip to get info based on those ids
[16:37:36] <Guest37025> any suggestions on how to set up this data and which is better is appreciated
[16:38:01] <LB|2> can you provide sample data showing what the proposed collection will look like?
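(No sample was posted in the log; purely for illustration, a document matching the description might be shaped like this, with all names as placeholders:

    // one polygon plus a large embedded payload per document
    db.geo.insert({
        area: {
            type: "Polygon",
            coordinates: [ [ [ 0, 0 ], [ 0, 1 ], [ 1, 1 ], [ 0, 0 ] ] ]
        },
        details: {
            name: "site-42",
            transactions: [ ]   // the 256 KB - 5 MB embedded data would live here
        }
    })

)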
[16:54:06] <syadnom> hi all. mongo newbie/moron here. need some quick help. I have a collection with 9 objects in it. I want to output just 2 pieces of data from each object. db.camera.find().pretty() gives me everything; I just want the 'host' and 'UUID' fields.
[16:54:36] <LB|2> the second parameter to find will let you project just the fields you want
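(Applied to the collection above:

    // the second argument is the projection; _id comes back unless excluded
    db.camera.find({}, { host: 1, UUID: 1, _id: 0 }).pretty()

)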
[17:08:17] <LB|2> however, there was a way to correct that but unfortunately the solution escapes me at the moment
[17:09:30] <syadnom> alright, next bit. I have this query: db.recording.find({},{cameraUuid:1,endTime:1}) - the endTime is a 'numberlong'; how can I get this to output in a 'date' format?
[17:10:04] <LB|2> are you familiar with javascript?
[17:10:05] <jiffe> looks like repairDatabase will do it, but I would need more than 50% available space
[17:17:36] <LB|2> the second one is the correction
[17:18:54] <syadnom> I may just do this in the shell. I'm already way outside my knowledge of mongo (obviously). I'm just doing a sanity check on these numbers to see if the date is giving me proper output
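(Assuming endTime is epoch milliseconds, the shell can convert per document; a sketch:

    // toNumber() converts the shell's NumberLong wrapper to a plain number,
    // which new Date() treats as milliseconds since the epoch
    db.recording.find({}, { cameraUuid: 1, endTime: 1 }).forEach(function(doc) {
        print(doc.cameraUuid + " " + new Date(doc.endTime.toNumber()));
    });

)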
[18:29:00] <StephenLynx> so you wish all messages the logged in user received
[18:29:04] <StephenLynx> and group them by sender?
[18:29:12] <Tyler_> I'd like all the conversations
[18:30:13] <Tyler_> so when I do find({recipient: logged-in user AND sender: person A, OR recipient: person A AND sender: logged-in user}), that gives me 1 conversation
[18:30:22] <Tyler_> and I'm trying to get a list of conversations
[18:30:56] <Tyler_> and I'm learning how to use the aggregate function to do that
[18:33:14] <StephenLynx> so you want the messages the logged-in user received
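(One way to sketch the aggregation being discussed: group each message by the "other" participant, so each group is one conversation. Collection and field names are hypothetical:

    var me = "tyler";   // the logged-in user
    db.messages.aggregate([
        // all messages the user sent or received
        { $match: { $or: [ { recipient: me }, { sender: me } ] } },
        // the other party identifies the conversation
        { $project: {
            other: { $cond: [ { $eq: [ "$sender", me ] }, "$recipient", "$sender" ] }
        } },
        // one result document per conversation
        { $group: { _id: "$other", messageCount: { $sum: 1 } } }
    ])

)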
[19:32:31] <DrLester> I'm adding lots of records in parallel into mongodb, and sometimes those records will have the same name; if that's the case I don't want to add them. In other words, I'd like the name to be unique. is there a way to do that in mongodb?
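(A unique index enforces this on the server side, which is exactly what you want with parallel writers; a duplicate insert then fails with a duplicate key error (E11000) that the application can catch and ignore. Collection and field names are placeholders:

    db.records.ensureIndex({ name: 1 }, { unique: true })

)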