pmxbot IRC Log Viewer

[00:01:26] <pikapp> Hello, I have a document that contains an array of objects that contains an array of other objects. I need to loop through these inner-inner arrays and remove a field from each object in the inner-inner arrays. I have successfully looped through using db.collection.find() but how do I $unset them?

[00:02:11] <pikapp> Ultimately, I can access the objects I need using nested loops inside db.collection.find() but not a query that I can use in db.collection.update()

[00:03:00] <pikapp> Is there a way to perform an update inside the db.collection.find() function without using a query?

[00:18:20] <Freman> so, in mongo, I can call db.log_outputs.find({task_id: "550f3b73cd3b101400377d18"}, {_id : 1, date : 1, duration: 1, pid: 1, code: 1, server: 1, failed: 1}).sort({date: -1})

[00:18:45] <Freman> but mongoish can't call LogModelInstance.find({task_id: this.task._id}, '_id date duration pid code server failed', {sort: {date: -1 }, limit: limit}, function(err, docs))

[02:16:15] <Axy> Hey all, I have an object structure like {"_id":"uuid",source:{"link":"http://","desc":"description","content":{"p1":"a","p2":"b"}}}

[02:16:34] <Axy> how can I queue p1's as an array

[02:16:41] <Axy> only p1 objects in each entry

[02:17:15] <Axy> I tried doing .find({},{p1:1}) did not seem to work

[02:20:49] <cheeser> try "source.content.p1"

[02:38:01] <Axy> cheeser, thanks! that worked! I didn't know it could be queried like that!

[02:38:12] <Axy> What if my content is an array with n number of elements,

[02:38:20] <Axy> how can I get the last element only?

[02:39:26] <Axy> {"_id":"uuid",source:{"link":"http://","desc":"description","content":[{"name":"one"},{"name":"two"},{"name":"three"}]}}

[02:39:32] <Axy> something like this cheeser ^

[02:44:27] <mantovani> hi

[02:44:51] <mantovani> how I do a mongoexport like "find({},{"user.name" : 1})"

[02:44:52] <mantovani> ?

[02:45:14] <mantovani> -q '{},{"user.name" : 1}' isn't working

[02:45:22] <mantovani> returns all fields

[02:45:24] <mantovani> :/

[02:51:16] <joannac> mantovani: http://docs.mongodb.org/manual/reference/program/mongoexport/#cmdoption--fields

[02:51:38] <mantovani> I saw now

[02:51:46] <mantovani> mongoexport -d BigData -c TwitterStreaming --fields '_id.,"user.name"'

[02:51:48] <mantovani> doesn't work

[02:52:07] <joannac> why is there a dot after _id?

[02:52:17] <joannac> why are there quotes around "user.name"?

[02:52:33] <mantovani> I removed

[02:52:39] <mantovani> mongoexport -d BigData -c TwitterStreaming --fields '_id,"user.name"'

[02:53:09] <mantovani> doesn't work either

[02:53:23] <mantovani> but it does work, "db.TwitterStreaming.find({},{"user.name" : 1})" in mongo

[02:53:25] <joannac> mantovani: pastebin?

[02:53:30] <mantovani> yes

[02:53:51] <mantovani> http://paste.scsys.co.uk/491319

[02:54:00] <mantovani> wired

[02:54:16] <joannac> mantovani: um, remove the quotes

[02:54:28] <joannac> like I told you like 5 lines up

[02:54:38] <mantovani> if I do, I get all fields :)

[02:54:46] <joannac> pastebin

[02:55:10] <joannac> I call shenanigans

[02:55:46] <mantovani> http://paste.scsys.co.uk/491320

[02:56:14] <joannac> sigh

[02:56:17] <joannac> not those quotes

[02:56:21] <joannac> the ones around username

[02:56:26] <joannac> user.name

[02:56:46] <mantovani> same result

[02:57:04] <mantovani> http://paste.scsys.co.uk/491321

[02:58:14] <joannac> mantovani: oh, you didn't read the docs

[02:58:15] <joannac> For JSON output formats, mongoexport includes only the specified field(s) and the _id field, and if the specified field(s) is a field within a sub-document, the mongoexport includes the sub-document with all its fields, not just the specified field within the document.

[02:58:33] <mantovani> crap

[02:58:50] <mantovani> this pseudo-database doesn't work pipe to do mongoexport/mongoimport

[02:58:59] <mantovani> now I can't do a simple export with a specific field

[02:59:03] <mantovani> ..........

[02:59:51] <joannac> also for the record, those two results are not the same

[03:00:14] <mantovani> both don't come with more than the specific fields

[03:00:17] <mantovani> ops

[03:00:24] <mantovani> both don't come with the specifics fields

[03:00:50] <joannac> what are you trying to do?

[03:01:17] <mantovani> I just need the field "user.name"

[03:01:21] <mantovani> and the "_id"

[03:01:39] <mantovani> I'll have to use an driver and cursor to do it in any language

[03:01:57] <mantovani> very inefficient

[03:02:39] <mantovani> s/an/a/

[03:04:18] <mantovani> mongodb is very nice for process high volume of "unique transactions( like query something or update)"

[03:04:37] <mantovani> but for throughput is very very very very bad

[03:04:55] <mantovani> and it doesn't support pipe, I had to put an extra driver just do to export/import

[03:05:59] <mantovani> in any database you can use named pipe "mkfifo" and use ssh or any socket to import/export from differents machines without use extra space

[03:06:23] <joannac> mantovani: 2 questions

[03:06:30] <joannac> 1. how did you determine pipes don't work?

[03:06:44] <joannac> 2. why can';t you use csv instead, since you only want 2 simple fields?

[03:07:10] <mantovani> joannac: 1 - trying and reading the documentation

[03:07:24] <joannac> mantovani: okay, cos I just tried it and it worked fine

[03:07:51] <mantovani> seriues ?

[03:07:52] <joannac> you're talking about mongoexport ... | mongoimport ... ?

[03:07:53] <mantovani> serius ?

[03:07:54] <joannac> yes

[03:07:56] <mantovani> yes

[03:08:02] <joannac> works fine for me

[03:08:18] <mantovani> mongodexport -d Dir/

[03:08:20] <mantovani> worked ?

[03:08:25] <mantovani> where inside Dir are pipes

[03:08:25] <mantovani> ?

[03:08:29] <joannac> what?

[03:08:39] <mantovani> collectios()

[03:08:41] <mantovani> foo,bar

[03:08:50] <mantovani> mkpipe Dir/foo.bson

[03:09:01] <mantovani> mkpipe Dir/fbar.bson

[03:09:10] <mantovani> mongodexport -d Dir/

[03:09:11] <mantovani> worked ?

[03:09:27] <joannac> I have no idea what that means.

[03:09:44] <mantovani> sorry

[03:09:47] <mantovani> isn't mongoexport

[03:09:50] <mantovani> is mongodump

[03:09:56] <joannac> ...

[03:10:11] <mantovani> http://stackoverflow.com/questions/24439068/tar-gzip-mongo-dump-like-mysql

[03:10:40] <mantovani> https://jira.mongodb.org/browse/SERVER-4345

[03:11:17] <mantovani> whatever

[03:11:21] <mantovani> if I use csv it does work ?

[03:11:25] <joannac> no

[03:11:28] <mantovani> just to specific fields

[03:11:31] <mantovani> ?

[03:11:42] <joannac> okay

[03:11:44] <joannac> step back

[03:11:46] <joannac> and start again

[03:12:00] <joannac> we spent 10 minutes talking about mongoexport and then you switched to mongodump

[03:12:03] <mantovani> to export just two fields, like "_id,user.foo" csv does work ?

[03:12:11] <joannac> for mongoexport, yes

[03:12:13] <mantovani> using mongoexport

[03:12:14] <mantovani> ok

[03:12:23] <joannac> the only issue is type conversion

[03:12:31] <mantovani> I can import it after using mongoimport ?

[03:12:36] <mantovani> can I import it after using mongoimport ?

[03:12:48] <mantovani> (without have to convert)

[03:13:00] <joannac> yes

[03:13:03] <mantovani> nice

[03:13:06] <mantovani> sorry, I'm drunk

[03:13:16] <mantovani> now backing at mongodump

[03:13:22] <joannac> don't drink and code

[03:13:54] <mantovani> zcat /opt/backup/mongodump-2014-12-03/accounts.people.bson.gz | mongorestore --collection people --db accounts -

[03:13:58] <mantovani> ok, I saw now

[03:14:02] <mantovani> I was wrong

[03:15:14] <mantovani> joannac++

[03:15:22] <mantovani> mongodb is nice

[03:18:15] <mantovani> joannac: sorry for be drunk

[03:18:17] <mantovani> xD

[03:26:06] <joannac> mantovani: np. glad we finally got there :)

[03:32:41] <tehgeekmeister> it seems like mongo is spending a whole ton of time in write locks (almost all writes are to the same database, but over many collections). I'm queuing up a fair number (mean is like 50) of writes. mongo is the bottleneck so far in this system, so I need to figure out what to do to optimize it. where should I look?

[03:33:17] <tehgeekmeister> (we're only at 15-60MB/s of write, far from the capacity of the disk most of the time. averaging like 15-20% disk utilization, as measured by iostat.)

[03:33:38] <mantovani> :)

[03:33:44] <mantovani> you need give way way way way way more information

[03:33:59] <tehgeekmeister> i have so much more I can give. where would you like me to start?

[03:34:01] <mantovani> like, what kind of operations are you doing ?

[03:34:06] <tehgeekmeister> almost exclusively writes

[03:34:14] <mantovani> what are the queries ?

[03:34:15] <tehgeekmeister> might actually be updates, can check that

[03:34:29] <mantovani> well if it is update will be slow, there is no magic

[03:34:39] <mantovani> in any RDBMS

[03:34:59] <mantovani> mongodb is sparse, updates and searchs are heavy

[03:35:29] <tehgeekmeister> crap. they are updates. they don't really need to be, but they are.

[03:35:37] <mantovani> if you do a query without predicate you can have high throughput

[03:35:39] <tehgeekmeister> (i'm just managing this app, didn't write it, hence not knowing that yet.)

[03:35:48] <mantovani> like how I'm doing now

[03:35:57] <mantovani> mongoexport --csv -d BigData -c TwitterStreaming --fields '_id,user.name' > twitter_names.csv

[03:36:01] <mantovani> 1193 be/4 mongodb 205.92 M/s 0.00 B/s 0.00 % 2.95 % mongod --confi~c/mongodb.conf

[03:36:56] <mantovani> if you have cpu and memory left, you can use more threads

[03:37:18] <tehgeekmeister> we're also hitting page faults, but that's probably only because of the updates

[03:37:24] <tehgeekmeister> not tons of page faults

[03:37:25] <tehgeekmeister> just some

[03:37:39] <mantovani> no magic, sorry

[03:38:14] <mantovani> like I said if you have cpu and memory left you can use more threads

[03:39:35] <mantovani> tehgeekmeister: if you need performance in bulk updates operations you should use Teradata, DB2 which has merge update

[03:39:59] <tehgeekm_> yeah, I don't expect magic. I just had been assuming they were inserts, because by my understanding they should be.

[03:40:00] <mantovani> and you can share data across many data using hash partition or whatever

[03:40:12] <tehgeekm_> i'm glad you mentioned it, because i wouldn't have thought to check otherwise.

[03:40:22] <mantovani> I had a problem like this in a costumer, he was doing a lot of updates using an ETL tool

[03:40:30] <mantovani> was taking 12 hours, I did the same using merge update

[03:40:36] <mantovani> took 5 minutes (not kidding)

[03:40:43] <mantovani> but was DB2

[03:41:06] <tehgeekm_> we're literally guaranteed that this data will always be unique by the time we do the update

[03:41:15] <tehgeekm_> no idea why we'd do an update

[03:41:40] <mantovani> if is "unique" insert if not update ?

[03:42:00] <mantovani> and the new data shoulded be "unique" ?

[03:42:03] <mantovani> something is wrong

[03:42:14] <tehgeekm_> i think what's wrong is that i misunderstood this app

[03:42:25] <tehgeekm_> gotta dig into it more

[03:43:02] <mantovani> tehgeekm_: how many fields has each column ?

[03:43:45] <tehgeekm_> varies widely, some are low (5-10 ish) some are higher (50 ish?) don't think any are huge

[03:43:53] <mantovani> if you suppose to has an unique field in each new record

[03:44:46] <mantovani> and you have this insane updates, if you should check your logic how you are doing right now

[03:44:50] <mantovani> :)

[03:45:03] <mantovani> and you have this insane updates, you should check your logic how you are doing right now*

[03:45:18] <tehgeekm_> I think they may not be insane during normal workload, but we're trying to backfill a couple years worth of data

[03:45:31] <tehgeekm_> basically, activity data streams in

[03:45:34] <tehgeekm_> and for many of those activities

[03:45:36] <tehgeekm_> we do an update

[03:45:50] <tehgeekm_> would be better to not block the backfill on those updates, and just do them all at once.

[03:52:36] <mantovani> tehgeekmeister: does your app need the update to be done in real-time or can be batch processing ?

[03:52:39] <mantovani> http://docs.mongodb.org/manual/reference/method/Bulk.find.update/

[03:52:56] <tehgeekmeister> we could totally do batch, but we'll have to rewrite a decent chunk of this

[03:53:07] <tehgeekmeister> i'd already been looking at that previously

[03:53:08] <tehgeekmeister> thanks

[03:54:52] <mantovani> tehgeekmeister: you would love merge update

[06:04:03] <aps> I'm looking for an automated way to backup a prod mongo sharded cluster of data size ~600GB. Is there a script available for this or what is the preferred method to do this?

[06:19:15] <prabampm> Guys, is there any way to restrict mongodb RAM usuage? i have goggled a lot and tried cgroup as well to limit the process memory but nothing seems to be working.. Application becomes dead slow after few days and sometimes getting "cannot allocate memory" error. As an interim solution, we are clearing the cache every one hour.. Any ideas to go forward???

[06:26:42] <Boomtime> mongodb uses (by default) memory mapped files - almost all of the memory used for these remains available to other processes to take

[06:27:06] <Boomtime> when this out-of-memory condition occurs, can you capture the output from ps -aux?

[06:44:10] <prabampm> Boomtime: thanks and i ll do that

[07:21:51] <Jonno_FTW> i need help with a mapreduce, I have a colleciton of documents with readings:[{count:1234,id:12},...], I want to know the proportion of all readings with a count > 2000, readings has different length in each document

[07:26:14] <Boomtime> you probably can't get a "proportion" in a single query, but you can get a count of the number of documents with that criteria easily enough

[07:26:48] <Boomtime> sorry, a count of the number of array entries in total with that criteria

[07:27:33] <Boomtime> i would suggest a sequence like this: $match, $unwind, $match, $group

[07:27:54] <Boomtime> 1. $match filter to those documents containing at least one array entry with count > 2000

[07:28:05] <Boomtime> 2. $unwind the readings array

[07:28:21] <Boomtime> 3. $match the same criteria again, to filter out those unwound which don't match

[07:28:29] <Boomtime> 4. $group all, note the count

[07:53:57] <pschichtel> Hi! Is there an mongodb equivalent to MySQL's "INSERT INTO table (id, numer) VALUES (1, 10) ON DUPLICATE KEY UPDATE numer=number+10" ?

[07:57:24] <Derick> pschichtel: nope

[07:58:07] <Derick> pschichtel: if you want auto-generated unique IDs for documents, just rely on the default _id generation

[08:03:06] <pschichtel> Derick: no I don't. My collections holds objects with a unique key and some counters. I want to either insert a new object if the key is not found or increment one of the counters by one

[08:04:19] <Derick> oh, hmm, let me think

[08:05:25] <Derick> you can just do : db.col.update(( { id: 1 }, { $inc: { counter: 1 } }, { upsert: true } );

[08:05:45] <pschichtel> that should do the trick?

[08:05:54] <Derick> this will increase counter by one if a doc with id=1 exists, if not, it will create a document with {id: 1, counter:1}

[08:05:54] <pschichtel> I'll try

[08:06:18] <Derick> i made a typo though, I used ( twice instead of once

[08:07:16] <pschichtel> not a problem, I won

[08:07:24] <pschichtel> 't copy it^^

[08:18:21] <Boomtime> be aware that an update matched against a unique indexed field with upsert:true has a most curious possibility (due to concurrency) of generating a duplicate-key exception - be prepared to re-try the operation in that event

[08:18:50] <burhan> are writes guaranteed/immediate in mongodb?

[08:20:19] <Boomtime> you want to define both of those terms - for write durability guarantees, you should read up on WriteConcern - the word 'immediate' is difficult to even define in a network environment, so what do you mean by that?

[08:20:55] <Boomtime> if you have received a success reply to a write that you issued, then the write is definitely visible to all clients

[08:21:13] <Boomtime> however, it is entirely possible that clients see the write result before you receive the reply confirming it

[08:22:08] <burhan> in my application, there are scheduled writes (either and update, or an insert) for documents; and I want to make sure that these writes are immediately available to all clients. Right now, there is only one node, but eventually there will me many read-only nodes. My concern is how to prevent the read-only nodes from having stale data?

[08:22:37] <Boomtime> what is a read-only node?

[08:23:04] <burhan> a node from which clients only read data, but don't write or update.

[08:23:21] <burhan> a slave

[08:23:24] <Boomtime> you need to define that in the terms of mongodb - is this a replica set?

[08:23:33] <burhan> yes.

[08:24:58] <Boomtime> reading from secondaries means you read stale data, this is a fact, it is physically impossible to avoid since there is an undefinable delay between primary and secondary - if you want to ensure always-everytime-absolutely-guranteed-up-to-date data then you read from the primary only - which is the default mode for this reason

[08:25:33] <Boomtime> under most circumstances, the secondariers are milliseconds-at-most behind the primary

[08:25:53] <Boomtime> but if those milliseconds mean something to you, then you must not read from them

[08:26:17] <Boomtime> if you want to scale your reads, you should look at sharding

[08:26:31] <Boomtime> replica-sets are for redundancy and durability

[08:27:21] <Boomtime> this says it better than i ever could: http://askasya.com/post/canreplicashelpscaling

[08:41:17] <pschichtel> Derick: I got my stuff running now, thank you.

[08:43:04] <Derick> cool :-)

[08:44:44] <burhan> How do you write a statement to fetch documents that do not match a regular expression?

[08:45:16] <Derick> I think you can use $not

[08:45:36] <Derick> db.col.find( { $not: { field: /regex/ } } );

[08:45:44] <Derick> warning: that will not ever use an index

[08:48:38] <burhan> @Derick: "Can't canonicalize query: BadValue unknown top level operator: $not"

[08:53:34] <Derick> always get this wrong :P

[08:55:40] <Derick> burhan: http://pastebin.com/GAe8p57p

[08:57:51] <burhan> okay, that one was easy - the really complicated one that I am trying to figure out is how to do a query like "if any of the sub-documents in a collection match a condition"

[08:59:00] <burhan> I have a document describing customers, and a list of other documents describing purchases. So I need to find all customer that have at least one purchase where the product code is 543.

[09:00:07] <burhan> for example: {'name': foo, purchases: [{'id': 543, name: "foo widget", price: 54}, ....]}

[09:00:44] <Derick> that's the purchase document?

[09:00:54] <burhan> no, that's the customer document.

[09:01:21] <burhan> the customer document has the puchases embedded.

[09:01:28] <Derick> do, db.customer.find( { 'purchases.id': 543 } );

[09:02:05] <Derick> however, I think you should split that up into separate purchase documents, instead of an array embedded

[09:02:22] <burhan> how would I query across them all though?

[09:02:33] <Derick> in your current schema?

[09:02:47] <burhan> no in the new schema where the purchases are separate documents.

[09:03:08] <Derick> well, the same?

[09:03:19] <burhan> wouldn't that be a join?

[09:03:25] <burhan> which I thought were not possible.

[09:05:08] <Derick> you do two queries

[09:05:22] <Derick> one to find all the customer "ids", and one to find the extra customer info

[09:06:05] <burhan> is that recommended? I thought it was better to have a nested document rather than multiple documents.

[09:07:25] <Derick> how many purchases are you going to get? that's an array that's going to grow grow (hopefully)

[09:07:50] <Derick> perhaps you might hit the 16mb limit? and, if you're using the mmap storage engine you end up moving the document around a lot too

[09:08:13] <Derick> nested documents like this are mostly ok if it's not data that's going to change a lot - I suppose

[09:08:35] <burhan> change means "added on" or "updated" - or it doesn't matter?

[09:09:57] <Derick> added on perhaps more than updated - updating nested documents in more than two levels is not easy to do

[09:51:35] <bilkulbekar> I have a mongo sharded cluster, with two shards, three replicas in each, is it a good thing to have mongos running on each of it?

[10:22:57] <burhan> is there a preferred GUI for mongodb?

[10:23:10] <burhan> I should say recommended

[10:24:27] <_rgn> robomongo is pretty good i guess

[11:07:57] <Axy> Hello channel. I have my documents in this structure: http://pastebin.com/Te1KqJWS -- I want to be able to pull the url's of te last "image" - how do I pick the last object in the array via find?

[11:08:01] <bilkulbekar> I have a mongo sharded cluster, with two shards, three replicas in each, is it a good thing to have mongos running on each of it?

[11:08:06] <Axy> I tried slice without success, maybe someone can direct me

[11:09:32] <Axy> I can do .fond({},{"images.url":1}) and only return url's but that is not for the "last image "only

[11:09:39] <Axy> getting the url of the last image is what I need

[11:16:27] <BlackPanx> is there a way to speed up mongo dump process ?

[11:25:46] <burhan> Axy: $slice: -1

[11:30:52] <Axy> burhan, can you give me a .find() example pelase

[11:30:56] <Axy> you can change mine

[11:31:12] <Axy> .find({},{"images.url":1})

[11:31:25] <Axy> where/how do I slice it burhan

[12:15:58] <Axy> is there any quick way to upgrade mongodb to the final release?

[12:16:07] <Axy> http://docs.mongodb.org/manual/tutorial/upgrade-revision/ makes it look like it's rocketscience

[12:17:22] <Derick> what version are you on now?

[12:17:45] <Axy> Derick, 2.6.6

[12:17:48] <Axy> I'm on digitalocean

[12:17:54] <Axy> (ubuntu)

[12:18:18] <Derick> first:Always backup all of your data before upgrading MongoDB.

[12:18:41] <Axy> Derick, I did mongodump - that's enough?

[12:18:50] <Derick> secondly, you should be able to : stop mongodb; put new binaries in place; start mongodb — providing you have a single node, and no other complicated setups

[12:19:08] <Derick> Axy: you should always verify your backups - but yes, that should be enough

[12:19:17] <Axy> no my setup was pretty straightforward, I followed a tutorial (I'm very new with all this)

[12:19:29] <Derick> do you use authentication?

[12:19:34] <Axy> no

[12:19:59] <Derick> then yeah, stop mongodb; apt-get update mongodb (or however you installed it); and start it back up again

[12:20:05] <Axy> I was toşd that no other people can reach thedb on my server anyway

[12:20:09] <Axy> so I did not do auth

[12:20:12] <Derick> okay

[12:20:20] <Derick> you should verify that though yourself :)

[12:20:30] <Axy> how?

[12:20:47] <chombium> hi Derik, I have an Ubuntu 12.04 with MongoDB 2.0.4 running and I want to upgrade to the current release. ist that possible?

[12:20:49] <Derick> do you have a mongo client installed locally?

[12:20:56] <Axy> yes Derick

[12:21:03] <Derick> chombium: 2.0.4? ... that's going to take some time and effort

[12:21:18] <Derick> Axy: okay, then do "mongo ip-of-your-digital-ocean-server"

[12:21:24] <Derick> and see if it connects

[12:21:34] <Axy> Derick, does it work with domain name

[12:21:38] <Derick> sure

[12:22:54] <chombium> i thought so, in the upgrade docs i've read that i need to upgrade first to 2.2, than 2.4, 2.6 and then to 3.0. i started with this http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/ but I can not find the packages for the older releases. is there some archive?

[12:23:20] <Axy> thanks Derick

[12:23:37] <Derick> chombium: https://www.mongodb.org/downloads#previous

[12:23:54] <Derick> chombium: but I doubt there are ubuntu packages for these old releases

[12:37:08] <chombium> Derick: Thanks for the link. it seems like a good startpoint. unfortunatelly there are not packages, I couldn't fund ppa as well. so if I manually upgrade it to 2.2 and so on, the data should be preserved without any problems, or?

[12:37:48] <d-snp> chombium: I'd use docker images to do the upgrade

[12:38:06] <d-snp> find or build docker images of each of the mongo versions

[12:38:21] <d-snp> and then just mount your database (after backup ofc) in them one after another

[12:38:32] <d-snp> shouldnt take you more than a few horus

[12:39:20] <d-snp> you need a more recent version of ubuntu to run docker though

[12:40:56] <chombium> d-snp I don't have docker. it's a root server. from the last test i did this works: apt-get install mongodb-10gen=2.2.0 it seems that there are packages after all

[12:41:33] <alexi5> hello

[12:42:59] <Axy> Derick, sorry for asking such beginner questions but, how do I update mongodb binaries?

[12:43:02] <d-snp> ok

[12:43:37] <d-snp> Axy: depends on how you installed them

[12:43:49] <Axy> apt-get update d-snap i am on ubuntu

[12:44:05] <Axy> don'tm ind the beginning of my message

[12:44:29] <Axy> d-snp, is there any way I can check how I installed them?

[12:44:49] <Axy> i just mongodumped everything and did "stop mongodb"

[12:45:43] <d-snp> hm

[12:46:03] <d-snp> if you do dpkg -L mongodb-10gen

[12:46:21] <d-snp> or just dpkg -L mongodb

[12:46:37] <d-snp> and it shows a whole bunch of files, then it's neatly installed

[12:46:51] <d-snp> (using apt-get install)

[12:46:52] <Axy> dpkg-query: package 'mongodb' is not installed

[12:47:08] <alexi5> guys I am new to mongodb and was wondering what applications is type of database more suitable for than RDBMS ?

[12:47:37] <d-snp> Axy: and mongodb-10gen not either?

[12:48:12] <Axy> dpkg-query: package 'mongodb-10gen' is not installed

[12:48:12] <Axy> d-snp

[12:48:12] <d-snp> then you can run `which mongod` to see where its installed

[12:48:30] <Axy> it says /usr/bin/mongod

[12:49:06] <d-snp> eh ok to be sure run dpkg -S /usr/bin/mongod

[12:49:35] <Axy> mongodb-org-server: /usr/bin/mongod

[12:49:53] <d-snp> ah ok so you installed it using a package called mongodb-org-server

[12:50:10] <Axy> haha no idea really, it happened more than 4 months ago

[12:50:17] <Axy> and I was following a tutorial

[12:51:31] <Axy> d-snp, if there is a way to make a clean install or "right way to install" I can try that as well

[12:51:57] <Axy> but if "everything is fine anyway" then maybe you can help me update this, I really do have no idea what I am doing atm

[12:52:07] <d-snp> http://docs.mongodb.org/manual/tutorial/install-mongodb-on-linux/?_ga=1.195084522.796200131.1404994453 <-- try this

[12:52:17] <d-snp> oh wait

[12:52:17] <d-snp> no

[12:52:20] <d-snp> dont do that, wtf

[12:52:22] <d-snp> thats horrible

[12:52:27] <Axy> haha ok

[12:52:42] <d-snp> sorry, I clicked 'on ubuntu' but it took me to the generic linux :P

[12:52:52] <d-snp> http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/ <-- I meant this one

[12:53:05] <Axy> d-snp, so wait, do I delete the old one?

[12:53:10] <d-snp> you don't have to

[12:53:13] <d-snp> it will happen automatically

[12:53:24] <Axy> with my db's as well?

[12:53:34] <d-snp> no

[12:53:37] <Axy> or will I be able to use my existing dbs

[12:53:38] <Axy> ok

[12:53:51] <d-snp> you've your db's backed up right?

[12:54:02] <Axy> yes

[12:54:07] <Axy> but still you know

[12:54:18] <d-snp> simply install the newest version from that guide, and then follow the upgrade guide

[12:54:35] <d-snp> or well, follow the upgrade guide, and use that tutorial for the 'install the latest version' step

[12:55:53] <d-snp> in a professional setting btw, you would run this entire procedure on a staging server, so you don't have surprises

[12:55:58] <d-snp> and minimal downtime

[13:01:20] <Axy> d-snp, I followerd http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/ but it's still the old version of mongod when I sudo start service

[13:12:15] <Axy> Derick, d-snp I followed the tutorials again to update my mongod from 2.6.6 to latest, I'm stuck here http://docs.mongodb.org/manual/tutorial/upgrade-revision/#upgrade-replace-binaries at step 3

[13:12:27] <Axy> how do I "replace the existing mongodb binaries with the downloaded binaries"

[13:13:11] <Derick> Axy: how did you install the original binaries?

[13:13:22] <Axy> I don't really remember

[13:13:30] <Axy> I think I followed a digitalocean tutorial or something

[13:21:47] <Axy> Derick, when I do "sudo apt-get install -y mongodb-org" it says "mongodb-org is already the newest version.

[13:21:47] <Axy> "

[13:22:01] <Axy> however when I start mongo it's v 2.6.6

[13:22:55] <Derick> you need to do "sudo apt-get update" first

[13:23:04] <Axy> I did

[13:23:17] <Axy> I followed the opdate tutorial, the version I download is 3.0.4

[13:25:29] <Derick> hmm,

[13:26:02] <Axy> dpkg -s mongodb-org

[13:26:02] <Axy> is 3.0.4

[13:30:46] <Axy> Derick, I think for some reason files were in use or something, I just did purge and installed everything from scratch

[13:30:48] <Axy> it works now

[13:31:00] <Axy> Anyway, thank you for the directions so far d-snp and Derick :]

[14:03:31] <watmm> Hey, would anyone have some suggestions as to how to avoid an uncaught MongoConnectionException with message Failed to connect to: localhost:27017: Connection refused when pm.max_children is reached?

[14:42:20] <iszak> Is there a limit to the $in function? e.g. 3.5 million limit?

[14:44:35] <Derick> iszak: 3.5million? sounds like you're hitting the maximum document size of 16MB there

[14:44:59] <iszak> Derick: it's a $match $in aggregate, so it's to filter out results, I wouldn't assume so?

[14:45:18] <Derick> the query is a document that is send to the server

[14:45:22] <Derick> 16mb applies there too

[14:46:13] <iszak> 3.5m compressed can't be that much

[14:46:22] <Derick> why do you think it's going to be compressed?

[14:46:41] <iszak> over the wire I imagine it's compressed

[14:47:09] <Derick> it's not

[14:47:32] <Derick> but it also wouldn't matter as it would apply to the uncomprssed document

[14:47:57] <iszak> so if I have all this data in mongodb how do I get access to it with a 16mb limit?

[14:48:12] <Derick> well, each *document* can be 16mb

[14:48:34] <Derick> so it's not a problem for the returned result

[14:48:40] <Derick> it's a problem for the query you send to the server

[14:49:04] <Derick> 3.5million elements in $in, is going to be roughly 16mb of data - that has to be send, as a document (with a 16mb limit), to the server

[14:50:09] <iszak> maybe I misunderstood the 16mb limit, have we got reading on this?

[14:51:17] <Derick> http://docs.mongodb.org/manual/reference/limits/#bson-documents

[14:51:34] <iszak> Derick: thanks

[14:52:18] <Derick> I think you just need to remember that "a query with all of its arguments and options" - is also a document

[14:52:39] <iszak> Derick: no I don't think so, I have not read this yet.

[14:52:51] <Derick> the aggregation query is also similar

[14:53:08] <Derick> mongodb doesn't allow more than 16mb sent to it in a single document

[14:54:30] <iszak> Derick: these are some pretty hard limits I'm seeing, only 32TB of data?

[14:55:17] <Derick> for a isngle DB - you can have a lot of DBs

[14:55:39] <Derick> and that's mmapv1 only, The WiredTiger storage engine doesn't have tha limit I believe

[14:55:39] <iszak> why are there all these limits? the only one I'd imagine is 32-bit limit

[14:55:49] <iszak> when will WiredTiger become the default?

[14:55:50] <Derick> and - at 32TB, you would need sharding anyway

[14:55:56] <Derick> which splits up your data

[14:56:03] <Derick> It is thought that WT becomes default in 3.2

[14:56:51] <Derick> most of the MMAP engine limitations are OS limitations though... not much we can do about it. But, you can scale to multiple shards to get around that.

[14:57:16] <iszak> yeah I'm using replicas atm

[14:57:39] <Derick> a replica is a full copy of your data, so that wouldn't help :)

[14:58:04] <Derick> then again, *one* node with 32TB of data is not easy to do anyway. As you need diskspace for it too.

[15:11:11] <Siamaster> I have an entity called User

[15:11:39] <Siamaster> and I'm gonna implement a chat service, so I thought I'm gonna make another entity called User User

[15:12:05] <Siamaster> where I keep track of invites/blocks/deletions and chat history

[15:12:35] <Siamaster> How would you index such entity so that it's easily queriable

[15:13:10] <Choreetso> hi

[15:13:11] <Siamaster> I feel the Id of such entity should be the id of both users

[15:13:22] <Siamaster> instead of a completely new ObjectId

[15:14:03] <Choreetso> I would like to pass documents from one collection into another, but I'm afraid of id clash

[15:14:19] <Choreetso> is there any way to pass them without the id?

[15:14:31] <Siamaster> generate new Ids?

[15:15:03] <Choreetso> Siamaster, hmmm so, would you do it and then, regenerate ids?

[15:15:23] <watmm> Is it normal for mongo to refuse connections and require a restart when php-fpm hits max.children?

[15:15:35] <Siamaster> yes I think the chance of a clash is very very small

[15:15:55] <watmm> It didn happen before (a mongo upgrade and bind_ip change from 127.0.0.1 to 0.0.0.0)

[15:17:13] <Choreetso> Siamaster, ok, thank you! I think I'll try regenerating ids with this anyway: http://stackoverflow.com/a/8723243/1781612

[15:17:41] <Siamaster> (Y)

[15:17:52] <Choreetso> :)

[15:58:25] <tehgeekmeister> in an aggregation, how do I say "make this field on the result be set this value, if it exists on the input document, or some default value otherwise"?

[15:59:32] <Gemtastic> Hello~

[16:02:25] <Derick> tehgeekmeister: you do that in your application after you've retrieved it

[16:03:21] <tehgeekmeister> hmm, for our use case that is impractical. i can do one aggregation pipeline pass, and a second collscan to reformat the output documents.

[16:03:38] <tehgeekmeister> I can't modify the consumers at this point

[16:03:51] <Derick> you can, but you should do it in your app / API end point

[16:04:00] <Derick> the consumer directly talk to your DB?

[16:04:19] <tehgeekmeister> yes, this is part of data analytics pipeline

[16:04:29] <tehgeekmeister> there are many many analytics flows that consume these collections

[16:04:33] <tehgeekmeister> hence, changing them all is a pain

[16:04:43] <Derick> aggregation pipeline it is then

[16:06:08] <tehgeekmeister> so, if i understand correctly, you're saying that a single pass of the aggregation pipeline will be difficult to use to accomplish the full goal?

[16:06:15] <tehgeekmeister> that's save me a lot of time to know

[16:06:21] <tehgeekmeister> so i just want to make sure I understand

[16:06:50] <Derick> i think you would need two pipeline stages

[16:07:33] <Derick> one $match, and one $project

[16:08:14] <tehgeekmeister> that's fine, two stages is fine

[16:08:19] <tehgeekmeister> i mean one aggregation

[16:08:21] <tehgeekmeister> not a single stage

[16:09:08] <Derick> right

[16:09:24] <Derick> "sing pass" confused me :-)

[16:09:26] <Derick> single*

[16:09:26] <tehgeekmeister> so the gotcha might be in this detail, we select out like four attributes, and group on those. then, the project would be merging those into a single string, that becomes the new _id

[16:09:35] <tehgeekmeister> yeah, overloaded terminology here

[16:09:49] <Derick> i am sure A/F can do it :)

[16:09:55] <Derick> but without example, I can't say much more

[16:10:02] <tehgeekmeister> okay, cool

[16:10:05] <Derick> it's also beer o'clock, so I am outtahere into the nice weather

[16:10:12] <tehgeekmeister> awesome, thanks for the help man

[16:10:19] <tehgeekmeister> you may have saved me much trouble

[16:20:18] <Siamaster> What is worst, data duplication or querying in large collections?

[16:20:49] <Siamaster> I've got a little bit of a decision problem

[16:20:55] <StephenLynx> depends on how much data you are duplicating.

[16:21:20] <StephenLynx> usually if you are pre-aggregating, it is better than querying it.

[16:21:22] <Siamaster> I've got this collection User and another collection ContactRequeswt

[16:21:31] <Siamaster> ContactRequest*

[16:21:39] <StephenLynx> so you are just duplicating the user's id?

[16:22:13] <Siamaster> Everytime the user is looking att their contacts they will see the pending contact requests

[16:22:18] <Siamaster> yes

[16:22:27] <Siamaster> hmm no

[16:22:52] <StephenLynx> I would put that on a separate collection.

[16:23:02] <Siamaster> ContactRequest got dateSent, status, message and dateanswered

[16:23:39] <Gemtastic> Bad design is worse than both large querying and duplicated data 8D

[16:23:40] <Siamaster> I wouldn't be able to index the contactrequests with status pending otherwise right?

[16:24:06] <Siamaster> Gemtastic what do you mean?

[16:24:15] <Siamaster> Design is for querying

[16:25:01] <Gemtastic> designing is for everything from within your application to within your database

[16:25:29] <Gemtastic> Luckily mongo is very flexible so you don't need to make schemas

[16:25:57] <Gemtastic> But designing your collections well is still something one should try imo

[16:26:02] <Gemtastic> :)

[16:27:24] <Siamaster> what I don't like about contactrequest being in it's own collection is

[16:27:36] <Siamaster> I will need to have 2 additional fields

[16:27:53] <Siamaster> another _Id and a sender and a receiver

[16:28:31] <Siamaster> and then always query on sender / receiver when I need to get forexample; all the pending requests received by user

[16:29:16] <Siamaster> when having an embedded collection I can have a boolean indicating if the request was sent/received

[16:29:47] <Siamaster> but then I have to remember to always update two fields when updating it

[16:31:03] <Siamaster> hmm no I think I'll go with collection feels really weird having duplicated message/date and stuff

[18:25:03] <Torkable> is there a way to check for "updatedExisting" on a findAndModify?

[18:25:31] <Torkable> I want to know if the upsert did an update or insert

[18:25:58] <Torkable> on update it seems to return a nice status but findAndModify return the doc

[18:45:08] <hahuang65> joannac: mind if I ask you a quick question regarding CS-21221

[18:47:37] <hahuang65> joannac: :) I posted it on the ticket, nm :D

[19:20:51] <pokEarl> Anyone with some Java/(MongoJack) experience that mind helping me out? https://www.reddit.com/r/learnprogramming/comments/3bnz0y/javamongojack_having_trouble_making_a_pojo_from/ Having trouble going from the MongoDB and back to java :(

[20:01:01] <StephenLynx> anyone here having issues installing mongo from http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/ on trisquel?

[20:01:10] <StephenLynx> does it work on trisquel? it uses apt-get too.

[20:08:52] <rdw> morning. is there a way to make Morphia return JSON API format?

[20:10:36] <StephenLynx> morphia?

[20:10:56] <rdw> Java mongodb mapper

[20:11:14] <StephenLynx> you should ask in the morphia channel, they should know more about it.

[20:11:38] <rdw> oic, thx I didn't know it was there

[20:12:53] <joannac> StephenLynx: if it's not in the list of supported OSes, no guarantees

[20:27:41] <whaley> pokEarl: did you solve it by changing version? It's been a couple of years, but I used to be a mongojack user.

[23:17:47] <Freman> good news everyone, php-mongo leaks memory

[23:27:00] <Boomtime> Freman: do you have a simple reproduction case which shows this?

[23:33:52] <Freman> http://pastebin.com/Fns7P1JY

[23:34:36] <Freman> the collection contains some 45650 odd documents (depending on the time of day), some as big as the 16 meg limit

[23:35:53] <Freman> pecl/mongo is already installed and is the same as the released version 1.6.9

[23:36:43] <Boomtime> now describe the memory leak - how do you measure this? over what period, etc

[23:40:25] <Freman> PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 23 bytes) in Eventing.php on line 139

[23:40:31] <Freman> the most basic of ways

[23:40:57] <Freman> 139 translates to $document = $cursor->getNext();

[23:41:12] <Freman> 28005th record

[23:42:24] <Freman> next run through I add memory_get_usage calls

[23:42:28] <Freman> 27979 35389440

[23:42:28] <Freman> PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 80 bytes) in /Users/shannon/Development/PHP/Eventing/Eventing.php on line 139

[23:42:40] <Freman> dies off 30 entries earlier

[23:42:50] <Freman> (well, 26)

[23:42:56] <Boomtime> line 139?

[23:43:08] <Boomtime> your code snippet doesn't have a line 139

[23:43:10] <Freman> line 139 translates to $document = $cursor->getNext();

[23:43:17] <Boomtime> "translates to"

[23:43:30] <Freman> yeh, my php has a bunch of class declarations at the top of the file

[23:43:37] <Boomtime> no, see you need to prove the memory leak is in the driver and not the hundreds of other lines of code that is yours

[23:43:38] <Freman> it's all idle code doing nothing tho

[23:44:09] <Boomtime> does the code snippet you provided exhibit a memory leak?

[23:44:18] <Boomtime> right now, the answer is no

[23:44:37] <Boomtime> but i would be very interested if you have a dataset to which the answer is yes

[23:44:46] <Freman> shannon@Shannons-MacBook-Pro ~/Development/php/Eventing $ php foo.php

[23:44:47] <Freman> PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 16 bytes) in foo.php on line 9

[23:44:50] <Freman> now it does

[23:45:54] <Freman> the only difference between the php in pastebin and php in foo.php is <?php and the actual connect url (shrub.logging.net is not the real servers name)

[23:45:59] <Boomtime> and this happens with documents that are close to the bson limit ?

[23:46:28] <Freman> possibly, there are definitely documents there that are at that limit

[23:46:37] <Boomtime> that is probably important - but i notice the amount of memory it has "allocated" seems to be incredibly low

[23:47:09] <Freman> yeh it jumps all over the place, and it doesn't die on the same records

[23:48:08] <Boomtime> allowed to use only ~130MB though? given that streaming just one large document in will take at least 20MB, and the cursor will start receiving the next one before you even get to it...

[23:49:13] <Freman> I tried setting batchSize(2)

[23:49:15] <Freman> didn't help

[23:49:45] <Boomtime> not too surprising, the document sizes still mean there are tens of megabytes involveed

[23:50:24] <Boomtime> the fact is you're manipulating very large documents in a very small space - why is the available memory so low?

[23:50:52] <Boomtime> legitimate use of the memory would easily put you over that amount

[23:51:29] <Freman> because it's my dev env (production it's set to 512 but I don't like to push that because there's 200 php processes per machine in production)

[23:52:50] <Freman> so you're saying the library is streaming more data in the background than I'm consuming and it's garbage collection is cleaning up?

[23:53:57] <Freman> my understanding leads me to believe there should be no more than 2 or 3 documents loaded at once, which 128 meg should easily accomodate

[23:54:16] <Boomtime> what i'm saying is that if you know that the library is going to need ~40MB just to be able to provide the mimimal ability to iterate a cursor... having only 130MB available sounds like pushing your luck

[23:54:49] <Boomtime> you're expecting the PHP garbage collector to be awesome at it's job, turns out it might not be

[23:56:58] <Freman> 28152 T8njgLsQB9MZyGkA46j2 208379056

[23:56:58] <Freman> PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 40 bytes) in foo.php on line 10

[23:57:01] <Freman> doubled the memory limit

[23:57:21] <Freman> (the increase of line numbers is a counter variable and an ini_set(), nothing more)

[23:57:36] <Boomtime> can you keep going as test? just to see what it takes to stabilise

[23:58:20] <Boomtime> if there is no upper limit then you should be able to construct a simple dataset, and iterate loop, that shows the problem - then you raise a driver bug

[23:59:00] <Boomtime> if what you're showing is simply an ineffeicency of memory re-use in the zend engine, then it's not really anything we do about it

[23:59:27] <Freman> http://pastebin.com/JjJm4AKM are the documents it slows down on

Log file Viewer

Help | Karma | Search:

#mongodb logs for Tuesday the 30th of June, 2015