[02:20:46] <dimon222> mongo is not supposed to be used as an in-RAM cache service. It mostly relies on hard-drive speeds, since it usually stores huge amounts of data (when RAM is obviously not enough)
[02:20:57] <cheeser> in fact, the mongo docs discourage using it as a cache layer
[02:21:00] <dimon222> if you want in-RAM, then look for memcached or redis
[02:22:25] <dimon222> that may be rough, but i would call Mongo a kind of solution for a non-relational data warehouse
[02:30:25] <FFForever> So if it's for non-relational data I would still be using it in conjunction with postgres or mariadb, right?
[02:30:53] <FFForever> Or would I just map the document?
[02:31:51] <cheeser> if you need another db like that, you're probably doing it wrong.
[02:32:23] <dimon222> i'm not sure what you mean by that
[02:32:45] <dimon222> so you just want cache layer?
[02:35:48] <dimon222> prolly this page may help a little with use cases
[03:23:04] <pikapp> I can’t seem to connect to mongodb remotely. I am pretty sure my firewall is open on port 27017. What else could I be doing wrong?
[03:26:46] <bmillham> My bad pikapp, 27017 is correct
[03:27:07] <pikapp> ok cool. i wouldn’t know any different :)
[03:28:35] <pikapp> cheeser: Boomtime: it was indeed the binding, thank you, fixed now
[03:30:09] <pikapp> which is odd because the default bind address is 127.0.0.1 when installing from yum on centos, so you would think one of the 3 tutorials i am using would mention it as a setup step
[03:33:13] <pikapp> it was just leading me down the iptables road and that wouldn’t have been good, so glad I asked here first
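A minimal sketch of the binding fix pikapp describes, assuming the old-style /etc/mongod.conf that 2.6-era CentOS packages used; the wildcard address is illustrative, not a recommendation:

```
# /etc/mongod.conf -- the default binds only to localhost, refusing remote connections
bind_ip = 0.0.0.0   # or a comma-separated list of specific interface addresses
port = 27017
```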
[09:11:52] <winem_> good morning. I'm a bit confused and need some help. I downloaded the mongodb 2.6.6 tar.gz for linux from the website and tried to add a user with createUser, but it says it's not a function. the deprecated addUser works, though
[09:17:53] <winem_> can anyone explain why addUser works in version 2.6.6?
[09:22:13] <winem_> damn, it's a version conflict between the client and the server
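For reference, the 2.6 shell helper winem_ expected looks like this (credentials and role are placeholders); an older 2.4-era shell only has db.addUser, which is why a mismatched client reports createUser as "not a function":

```
// run from a 2.6+ mongo shell against the admin database
use admin
db.createUser({
  user: "admin",          // placeholder user
  pwd: "secret",          // placeholder password
  roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
})
```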
[09:40:31] <hmsimha> if you have a field you want to either increment or set to a value if upserted, can that be done atomically?
[09:42:05] <hmsimha> for example if a user is given $100 then I want to either increment that user's piggybank field by 100 or set it to 100 if that user doesn't already have a document
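This case is exactly what $inc with upsert: true does atomically: if no document matches, the upsert inserts one with the field set to the increment amount; otherwise the field is incremented. A sketch with hypothetical names:

```
db.users.update(
  { _id: userId },               // hypothetical selector
  { $inc: { piggybank: 100 } },  // creates piggybank: 100 on insert, adds 100 otherwise
  { upsert: true }
)
```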
[13:21:54] <swaagie> Are there issues with mongodb 2.6.6 and debian with geoNear queries? Got a script that runs fine on my local ubuntu machine but fails on some debian server claiming the special index: 2d is missing, while my debug statement after ensureIndex returns: Ensured index ran, result: location_2dsphere (no errors)
[13:24:42] <swaagie> the query I'm running is just a basic: `{ location: { $near: { $maxDistance: 5E4, $geometry: { type: 'Point', coordinates: [lng, lat] }}}}`; as it's 2dsphere I don't get why it complains about missing 2d indexes in the first place
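For reference, the index that swaagie's ensureIndex output (location_2dsphere) corresponds to would have been created like this; the collection name is a guess:

```
db.places.ensureIndex({ location: "2dsphere" })  // required by $near with a $geometry point
```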
[14:01:07] <jiffe> can that be manipulated? it seems that after starting a replica member, it crashes soon after it starts to sync. I found a related issue where someone had the same condition, and it came down to detected data corruption
[14:02:17] <jiffe> although they did a full resync to resolve the problem, which suggests the corruption was on the side of the member that was crashing
[14:03:18] <jiffe> a full resync is really a last ditch option in my case though because there's just too much data and these machines should be mostly in sync already
[14:04:34] <jiffe> I don't know if the comment that mongodb shuts itself down to prevent further damage is correct either
[14:04:59] <jiffe> a bus error 'Invalid access at address' doesn't sound like a graceful shutdown
[14:05:56] <jiffe> so really I have no idea what the problem is here
[14:07:05] <jiffe> the optime on the working member is two operations higher than the non-working member
[14:07:22] <jiffe> I don't know if I can just try to ignore those two operations
[14:28:11] <stava> Doing db.collection.update({...}, {$pull: {myArray: 'foo'}, $addToSet: {myArray: 'bar'}}) I get the error "Cannot update 'myArray' and 'myArray' at the same time". But how then can I remove "foo" and add "bar" to myArray in a single update()?
[14:31:02] <stava> I thought I'd at least be able to nest some $-operators to accomplish something like this
[14:31:21] <cheeser> seems like arrays have certain constraints.
[14:31:36] <cheeser> you can definitely do multiple updates at once. but maybe not on arrays...
[14:32:04] <stava> {myArray: {$concat: [{$pull: ...}, {$addToSet: ...}]}} is what i imagined :)
[14:32:31] <stava> or $diff, but whatever I see what you're saying
[14:32:39] <stava> It's not the first time i struggle with arrays :)
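The usual workaround, since MongoDB rejects two array operators targeting the same field in one update document, is two sequential updates; note the pair is not atomic. A sketch with a hypothetical selector:

```
db.collection.update({ _id: docId }, { $pull:     { myArray: 'foo' } })
db.collection.update({ _id: docId }, { $addToSet: { myArray: 'bar' } })
```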
[14:35:22] <swaagie> Would anyone know if there are issues with mongodb 2.6.6 and debian with geoNear queries? Got a script that runs fine on my local ubuntu machine but fails on some debian server claiming the special index: 2d is missing. The query I'm running is just a basic: `{ location: { $near: { $maxDistance: 5E4, $geometry: { type: 'Point', coordinates: [lng, lat] }}}}`
[14:35:22] <swaagie> as it's 2dsphere I don't get why it complains about missing a 2d index in the first place
[14:35:46] <cheeser> what version is on your local machine?
[14:52:23] <StephenLynx> but the overall impression I am getting from searching is that it is indeed a mongo deficiency: you can't perform multiple updates on sub-arrays in a single query.
[14:54:20] <swaagie> cheeser: got it working, thanks for the help. silly me, apparently the reinstall did not kill the older mongod that was still running; somehow the client connected like it was working normally, tricking me into thinking 2.6.6 was running
[14:55:10] <jiffe> well my crashing does seem to be due to resyncing; if I shut the other replica member down, the member that crashes doesn't crash until I start the other member up again
[14:55:41] <StephenLynx> swaagie: it probably happens because, from what I heard, linux keeps the files in memory; unlike windows, it won't lock files that are in use by the system
[14:56:14] <swaagie> StephenLynx: that would fit the symptoms perfectly
[14:56:47] <StephenLynx> lesson learned: make sure stuff is shut down for good when uninstalling/reinstalling :v
[15:00:50] <jiffe> you can still have the contents of that file in memory with no references
[15:02:09] <FunnyLookinHat> I'm not sure what I should be googling for - feel free to suggest the keywords I'm missing... but I want to create a report that looks for a nested key/value pair within a collection... what would that be called?
[15:02:47] <FunnyLookinHat> i.e. { 'someObj': { 'someKey': 'someVal' } } - I want to look through data on 'someKey' - if it exists.
[15:06:13] <FunnyLookinHat> I wasn't sure if I should be generating a view or something... ( totally new to NoSQL )
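What FunnyLookinHat describes is a query on an embedded field, written with dot notation; $exists covers the "if it exists" part. Using the field names from the example above:

```
db.collection.find({ "someObj.someKey": { $exists: true } })  // documents where the key exists
db.collection.find({ "someObj.someKey": "someVal" })          // documents with a specific value
```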
[15:06:43] <StephenLynx> but I wouldn't add fields that hold objects.
[15:06:53] <StephenLynx> it would create complexity without actually adding anything.
[15:07:15] <StephenLynx> for one, I only have objects in arrays, where said objects make sense.
[15:07:27] <StephenLynx> I aim for the plainest model I can.
[15:08:34] <FunnyLookinHat> StephenLynx, Yeah - this field is for arbitrary data that's specific to the use case... it's a one-off compared to the other top-level fields I have
[15:09:04] <FunnyLookinHat> StephenLynx, i.e. https://github.com/funnylookinhat/tetryon/blob/master/spec/particles.json
[15:09:11] <FunnyLookinHat> Feel free to tell me if I'm doing things totally wrong./
[15:09:22] <FunnyLookinHat> gosh that syntax highlighting is awful
[15:17:34] <FunnyLookinHat> I can't use the built-in ID as it's not unique enough for my use case ( has to be client generated and nearly-guaranteed unique )
[15:17:57] <StephenLynx> I know, I don't use the built-in id either.
[15:20:41] <FunnyLookinHat> StephenLynx, thanks for the feedback and help :)
[15:21:31] <StephenLynx> and as I said, I would remove the data field and just add the fields to the object.
[15:21:35] <StephenLynx> there is really no gain in doing that
[15:23:10] <FunnyLookinHat> StephenLynx, it doesn't help to keep things organized at all? these will be user-customizable - so I was thinking it would just muddy the collections
[15:23:22] <FunnyLookinHat> The top level fields are the only non-customizable ones
[15:23:36] <StephenLynx> you are just going to add more verbosity to your queries
[15:23:56] <StephenLynx> you might as well set categories in your documentation
[15:24:07] <StephenLynx> and say "these are the non-customizable fields"
[15:24:14] <StephenLynx> but don't apply it to your model.
[15:28:21] <StephenLynx> I took that lesson from python.
[15:28:43] <StephenLynx> even though I don't like the language, I like bits of its philosophy
[15:32:52] <FunnyLookinHat> StephenLynx, yeah good pt.
[15:33:24] <mrx1> I will be storing data logs from different devices (~100 devices) and there will be many data points (let's assume 10M in total). I would like to be able to quickly retrieve the contents of the last data point for each device. How do I do this nicely? I could store the last record of each device in a separate field in the devices collection, but this would require updating 2 documents when inserting records - so it wouldn't be atomic and I would have to mimic transactions. It's possible to do, but maybe there are some other possibilities?
[15:34:01] <mrx1> and by quickly I mean less than 0.5 sec spent on the db query on an average pc
[15:35:30] <StephenLynx> mrx1 the collection holds objects with A: device ID, B: device logs
[15:35:52] <StephenLynx> then you just get the last element of B where the document matches the device ID you are looking for
[15:38:10] <mrx1> I'd like the query time to be independent of the size of the data log collection
[15:39:21] <StephenLynx> how do you know it will not be like that?
[15:39:30] <StephenLynx> you are just asking for the last element of the array all the time
[15:41:12] <mrx1> Not the last element of the array, but the last element for each device. One of those elements may actually be the last in the whole collection, but most of them won't be :)
[15:43:09] <StephenLynx> in that case it can never be constant
[15:43:28] <StephenLynx> if you have 1 device, you will read 1 element; if you have 10000 devices, you will read 10000 elements
[15:43:46] <mrx1> it could be, if there were some mechanism to store the last log for each device while inserting that log
[15:44:05] <mrx1> but probably there is no such thing that would be atomic
[15:44:07] <StephenLynx> but that's what I told you to do
[15:44:17] <StephenLynx> have the document hold an array of logs for that device
[15:44:29] <StephenLynx> so you just read the last element of the array in the returned devices
[15:45:38] <mrx1> I can't hold the logs of a device in one document, because there can be a lot of logs per device (more than 16 MB)
[15:45:52] <mrx1> So I will have collection of logs, with device field
[15:46:37] <StephenLynx> yeah, you can have that, but that would be even more expensive, IMO because then you will have to splice and group.
[15:46:51] <StephenLynx> or can't mongo have documents with more than 16mb?
[15:47:20] <mrx1> yeah, as far as I read, mongo can't have documents larger than 16 mb
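Given the 16 MB document cap, one common pattern for mrx1's case is a flat logs collection with a compound index, fetching the newest entry per device either directly or via aggregation; field and collection names below are illustrative:

```
db.logs.ensureIndex({ device: 1, ts: -1 })

// newest log for a single device: a cheap, index-bounded read
db.logs.find({ device: deviceId }).sort({ ts: -1 }).limit(1)

// newest log for every device in one pass (2.6+ supports $$ROOT)
db.logs.aggregate([
  { $sort: { device: 1, ts: -1 } },
  { $group: { _id: "$device", last: { $first: "$$ROOT" } } }
])
```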
[16:38:54] <StephenLynx> db's are just a really different and constrained way to lay out instructions
[16:39:05] <StephenLynx> but it either fails or it doesn't.
[16:39:36] <StephenLynx> when it fails but looks like it's working, only to crumble later when you have absolutely no idea what the problem is, then you will say "this is hard"
[16:47:58] <sekyms> can you do db.products[objectid].count({"isbn":"00-00-110"})
[16:51:00] <sekyms> yeah that gives me the count of products
[16:51:08] <sekyms> where I want the count of isbn
[16:52:17] <StephenLynx> doing a group is not hard and it seems to be the optimal way
[16:52:33] <StephenLynx> just make sure you use a match before
[16:52:48] <StephenLynx> so you only unwind the array of the exact object you want
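A sketch of the match-then-unwind pipeline StephenLynx outlines, with guessed names; it assumes each product document holds an array of isbn entries and that sekyms wants the count of matching entries inside one product:

```
db.products.aggregate([
  { $match: { _id: objectId } },                 // narrow to the one document before unwinding
  { $unwind: "$isbns" },                         // one output doc per array entry
  { $match: { isbns: "00-00-110" } },
  { $group: { _id: null, count: { $sum: 1 } } }
])
```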
[17:02:08] <rodd> Hi, I'm new to mongo. I'd like to set up a document for countries with lat and long. Can I use whatever geospatial index I choose, or would it be better to use a specific one, like 2dsphere?
[17:03:19] <rodd> or just 2d, maybe 2dsphere is too much?
[17:06:13] <keso_> anybody used morphia for java before? I have a question about the UpdateOperations class
[17:10:42] <StephenLynx> rodd: you can use anything for the index, you just set it as unique. but personally I would use something human friendly
[17:11:33] <rodd> StephenLynx: hm what do you mean by human friendly in terms of data storage?
[17:12:12] <rodd> StephenLynx: oh right, but I'd like to be able to detect a user's position through lat and long
[17:12:29] <rodd> anyway thanks, I think 2d is enough
[17:12:54] <StephenLynx> if you want to have that every time you read the user, then yes, I would use that.
[17:13:51] <StephenLynx> if you are ok with making occasional additional queries for that, and want to show something in the GUI regarding the country, then I would use something else.
[17:14:04] <StephenLynx> it varies on how you want to use it.
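For what it's worth, 2d indexes cover flat-plane coordinates while 2dsphere models points on the Earth's surface, so 2dsphere is the usual choice for real latitude/longitude data. A sketch with a hypothetical collection; note GeoJSON puts longitude before latitude:

```
db.countries.ensureIndex({ location: "2dsphere" })
db.countries.insert({
  name: "Iceland",
  location: { type: "Point", coordinates: [-19.02, 64.96] }  // [lng, lat] order
})
```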
[17:18:07] <ac360> Hi all, I'm a bit new to MongoDB and database infrastructure and I have a fundamentals question. In general, if I have an application running on a server, I want my Mongo database to be hosted on the same server so that both can communicate quickly without any network latency, correct?
[17:19:31] <ac360> I'm just curious how this works in AWS EC2, if I spin up an EC2 instance with a web application and then use a managed-MongoDB provider like MongoLab, who claim they can also host the database on AWS to "host the database where my app is".
[17:20:09] <ac360> How does my app connect to that database over the most optimal connection?
[17:20:44] <ac360> (Sorry for the vague language. I just don't know a lot about the connection protocol through which apps connect to the databases)
[17:21:14] <ac360> Amazon Web Services EC2 is a cloud computing service which lets you spin up computing instances you can use for launching web applications
[17:21:18] <kali> ac360: they will allow you to pick an ec2 zone, iirc. so the database will not run on the same server, but it will not be very far
[17:21:46] <kali> ac360: it's not that common to have the DB and the app on the same hardware actually
[17:21:53] <StephenLynx> that is very specific to their services. you would get more people to help you contacting them or in their forums
[17:22:10] <ac360> kali: Thanks. So Every query to the database is actually a network request and is subject to network latency?
[17:22:22] <ac360> It's just a "close network request"?
[17:25:32] <ac360> I ask because I'm intrigued by AWS's new Lambda service. It allows me to simply upload javascript functions to the cloud and uses a minimal amount of computing resources to process them. I'd like to write javascript that connects to a Mongo database, but I don't know the best way to set up the infrastructure to be as fast as possible.
[17:27:09] <kali> as far as i understand it, lambda is not for low-latency problems (like answering http requests), more for high-bandwidth problems (async jobs, generating stats, thumbs, etc.)
[17:27:22] <StephenLynx> ac360 if performance is an issue maybe you could rent servers and handle them via SSH
[17:27:57] <StephenLynx> minimal solutions are usually the fastest.
[17:28:35] <ac360> kali: Right, but I need persistence to run those jobs
[17:30:39] <ac360> StephenLynx: That's what I do, but I'm looking to stop paying monthly fees for server infrastructure and instead pay only for the precise infrastructure required when my code is actually run. This is what AWS Lambda represents.
[17:30:58] <ac360> It will result in significant cost savings
[18:14:56] <jiffe> can you skip operations when replicating?
[18:42:34] <jcox1> hey everyone. I'm looking for some advice on configuring replica set members. The docs seem ambiguous regarding what I want to do so I figured I'd check here to see if anyone has experience
[18:43:07] <mrx1> Is it OK to use the username as the object id in the Users collection? I will have only one user per name and I won't change usernames. But that username will be less random than an autogenerated ID. Is it okay, or will it introduce some performance penalty or something?
[18:43:41] <cheeser> your ID can be whatever you need it to be. but mongo won't let you change the _id value of a document.
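A minimal illustration of cheeser's point, with made-up values; any unique, immutable value works as _id:

```
db.users.insert({ _id: "mrx1", createdAt: new Date() })  // username as the primary key
db.users.findOne({ _id: "mrx1" })                        // lookups go through the _id index
```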
[18:48:49] <jcox1> I have replica sets with three members (1 prim., 2 sec). I want to add a node that I am solely going to use for dumping data into our data warehouse. I'm going to configure this member to be hidden (and priority:0) so it doesn't receive any read traffic. But now I have 4 voting members in the replica set. So I either need to make the new member non-voting or add an arbiter. Ideally I'd just make it non-voting, but the documentation seems to discourage changing
[18:50:22] <cheeser> simple enough to add the arbiter
[18:54:17] <jcox1> @cheeser are there disadvantages to having the new member be non-voting? I'll be deploying this setup to many many replica sets across many mongo clusters. So if there's no disadvantage, it'd be cheaper to have the warehouse nodes not vote.
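A hedged sketch of the non-voting variant jcox1 is weighing (hostname and member _id are placeholders). Hidden members must have priority 0, and votes: 0 keeps the set at three voting members without an arbiter:

```
cfg = rs.conf()
cfg.members.push({
  _id: 3,                               // next free member _id
  host: "warehouse.example.com:27017",  // placeholder host
  hidden: true,
  priority: 0,
  votes: 0
})
rs.reconfig(cfg)
```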
[20:04:46] <dman777_alter> much... not sure how this happened but I have a duplicate collection. In other words, I have 2 foos... one is (empty) bytes and the other is 2 gigs
[20:04:54] <dman777_alter> how do I get rid of the empty one?
[21:06:21] <digicyc> You sure they have the same exact name?
[21:06:42] <digicyc> I can't even figure out how to add a collection with the same exact name as an existing collection. o_O