PMXBOT Log file Viewer

#mongodb logs for Monday the 9th of July, 2012

[01:49:52] <arrdem> data model design question: in MySQL I understand that it is more "proper" to maintain separate tables for users, passwords and permissions. This makes sense because in an unabstracted data store you want to query over the minimum amount of data possible. Is this still the "best practice" for a document store like Mongo?
[02:10:26] <daveluke> do some instances of mongo require a commit or something?
[07:20:46] <Bartzy> how do I cancel a query/command in mongo shell ?
[07:20:51] <Bartzy> Ctrl+C exits the shell altogether
[07:21:15] <Zelest> it should go "wanna cancel? y/n?"
[07:22:42] <vsmatck> blech
[07:22:47] <vsmatck> Windows UI is shit.
[07:23:14] <kali> Bartzy: you can go back to the shell, and use db.currentOp() / db.killOp() to cleanup the mess
[07:23:19] <vsmatck> Confirm everything. Windows has the passive-aggressive UI. heh
[07:23:35] <Bartzy> kali: Yeah, but I can't use the current shell ?
[07:23:58] <Bartzy> kali: I have another question. The index size from db.shares.totalIndexSize(), is in bytes ?
[07:24:03] <kali> Bartzy: i don't know how the windows shell behaves
[07:24:28] <Bartzy> Also, for a big setup that we're now migrating from MySQL, around 200 million documents, 100GB +- of data - What should I be aware of / read ? :)
[07:24:36] <Bartzy> kali: I'm using linux
[07:26:02] <Zelest> http://www.kchodorow.com/blog/2012/04/20/night-of-the-living-dead-ops/
[07:26:12] <Zelest> was gonna link that as well... but took a while to find it. :-)
[07:26:19] <Zelest> not fully sure if it's related though.
[07:26:41] <kali> Bartzy: yes it's bytes
[07:28:01] <Bartzy> thanks kali.
[07:28:25] <Bartzy> So with the 'mongo' linux shell, if I use ctrl+c, it asks me if I want to end the current operation or something, and no matter what I say, it exits.
[07:28:31] <[AD]Turbo> hola
[07:28:32] <Bartzy> Can I cancel the current op, but not exit ?
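
A minimal sketch of what kali suggests, run from a second mongo shell; the opid value is illustrative and has to be taken from the currentOp() output:

    // list running operations and note the opid of the runaway one
    db.currentOp().inprog.forEach(function (op) { printjson({opid: op.opid, ns: op.ns, op: op.op}); })
    // then kill it (12345 is an illustrative opid from the output above)
    db.killOp(12345)
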
[07:48:31] <akaIDIOT> mornin' chaps
[07:48:35] <akaIDIOT> well, over here anyways
[07:49:14] <akaIDIOT> i saw a presentation by a 10gen-guy who briefly mentioned mongodb having a problem with data sizes over 2P
[07:49:23] <akaIDIOT> something about chunk info needing to be in-memory
[07:49:36] <akaIDIOT> anyone got some more insight into that? :)
[07:51:18] <akaIDIOT> think it was this presentation http://www.10gen.com/presentations/mongosf2011/sharding
[07:57:31] <phira> not sure I'd choose mongo for that kind of size stuff
[07:57:43] <phira> seems like the kind of job for big-data software
[07:59:40] <Bartzy> "You may use sort() to return data in order without an index if the data set to be returned is small (less than 32 megabytes in version 2.0, or less than four megabytes in version 1.8 and earlier). For these cases it is best to use limit() and sort() together."
[07:59:42] <Bartzy> This is not true!
[07:59:52] <Bartzy> on a 100 million documents collection, I tried doing:
[08:00:25] <Bartzy> db.things.find().sort({myid:1}).limit(50), returned data set should be a few KB at most, and it takes forever!
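
For reference, a sketch of the two ways around the limit Bartzy is hitting, assuming his db.things collection and myid field: without an index the server still scans every document even when limit() is used, while an index lets the sort walk index order.

    db.things.find().sort({myid: 1}).limit(50)            // no index on myid: full collection scan, top 50 kept in memory
    db.things.ensureIndex({myid: 1})
    db.things.find().sort({myid: 1}).limit(50).explain()  // "cursor" should now show BtreeCursor myid_1
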
[08:01:42] <jondot> hello
[08:02:37] <jondot> i have a use case where I would go over all records in a collection and mark the last time I accessed each. this is done by around 10 workers. so each will grab a subset of the collection, read each record and update each record. i'm wondering how this fits with mongodb
[08:06:22] <akaIDIOT> phira: thanks, you'd say mongo was not suited for the big data world? :P
[08:07:10] <akaIDIOT> I'm not sure how up to date the talk is versus the current state of mongodb
[08:07:19] <akaIDIOT> details like that prove to be hard to dig up :P
[08:09:02] <Null_Route> Hey Guys!
[08:09:35] <Null_Route> Can any wizards out there help me with an update query to change the second octet of an IP address?
[08:09:54] <vsmatck> How many bits is big data?
[08:10:10] <akaIDIOT> vsmatck: heh, good point :P
[08:10:31] <Bartzy> if I want to order by insert time, I can just sort by _id, right ?
[08:13:29] <vsmatck> Bartzy: Seems like that would work. The highest 4 bytes of _id are the timestamp.
[08:14:38] <Bartzy> vsmatck: And the rest of them? This would work for sharded environments too?
[08:15:01] <Bartzy> vsmatck: This is good to do? Or should I just create an index for the ts field I have and sort by that?
[08:15:32] <Bartzy> It needs to be a compound index anyway, so there's no benefit of using the existing single _id index.
[08:15:54] <Bartzy> So if ObjectId is bigger than ISODate, maybe it's better to use the timestamp index ? Does that make sense ?
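
A quick illustration of vsmatck's point, reusing an ObjectId value that appears later in the log: the leading 4 bytes of an ObjectId are a second-resolution timestamp, so sorting on _id roughly follows insert order (ids generated within the same second, or on different client machines, are not strictly ordered).

    db.things.find().sort({_id: -1}).limit(10)            // most recently inserted first
    ObjectId("4ffadb1b62770d4e13724937").getTimestamp()   // -> ISODate("2012-07-09T...")
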
[09:25:54] <brahman_work> Hi, Am about to create an index on a sharded mongodb cluster. Documentation indicates that the best way to create indexes whilst causing minimum performance hit is to build the indexes on each secondary in the replica set and then demote the primary and so on. Is the process the same for a sharded setup?
[09:50:45] <remonvv> A little off topic but is anyone aware of any blogs/articles that have measured Amazon EC2's instance failure rate and such?
[10:23:20] <Bartzy> 'millis' in explain() are valuable only on the first execution ? After that it's in cache or something and returns 0 ?
[10:34:08] <Bartzy> Mon Jul 9 10:34:17 SyntaxError: unterminated string literal (shell):1
[10:37:58] <augustl> I'm storing an array of "users" that has access to a certain document. Is it important that I store actual ObjectIds and not just the hex strings of the ObjectIds?
[10:38:28] <augustl> working with strings seems much easier, that's what I expose through my HTTP API anyway.
[10:39:44] <ron> augustl: it really doesn't matter.
[10:40:29] <Bartzy> ron: Why it doesn't matter?
[10:40:48] <augustl> good, that was my guess as well :)
[10:41:11] <ron> Bartzy: why does it?
[10:41:56] <Bartzy> because ObjectIds are unique? The hex is unique too ?
[10:42:06] <Bartzy> Sorry that is silly
[10:42:15] <Bartzy> But the ObjectIds are maybe more efficient in space ?
[10:43:03] <ron> ObjectIds are more efficient in space? really? :)
[10:43:33] <Bartzy> I dunno :)
[10:43:52] <Bartzy> I have another question: I'm trying to use $in with ~400 values. I get : SyntaxError: unterminated string literal (shell):1 in the mongo shell
[10:43:55] <Bartzy> Why is that ? :|
[10:44:22] <ron> because you're not using it properly? ;)
[10:44:22] <algernon> your JS is missing a " somewhere, most likely.
[10:44:57] <algernon> paste the stuff to something that can syntax highlight JS, and see where the runaway string is
[10:45:49] <augustl> Bartzy, ron: well it might be that they're more efficient in space, a string of hex encoded bytes typically stores as twice as much data since two characters/bytes are used to represent one byte
[10:46:07] <ron> who cares. do whatever feels comfortable to you.
[10:46:32] <algernon> OIDs *are* more compact. That's about their only advantage of storing them as binary instead of hex strings.
[10:47:00] <Bartzy> ron: I just popped it to vim. The highlighting stops in the middle of 1 string!
[10:47:02] <augustl> I suppose querying is slightly faster since mongodb can know it's a pure byte compare, no string comparison fancyness required
[10:47:10] <Bartzy> maybe it's some JS length limit ? :|
[10:47:21] <augustl> obviously now I'm just prematurely optimizing without measuring a single thing :)
[10:47:49] <ron> disk space shouldn't be much of a consideration imho.
[10:47:55] <ron> speed may be something else.
[10:48:03] <Bartzy> disk space - no. RAM and fitting the data into RAM, I think yes ?
[10:49:22] <Bartzy> ron: You have any idea why the JS shell acts like that ?
[10:49:23] <algernon> augustl: mongodb strings have a length encoded, so comparing them is fairly efficient. besides, in the best case, you compare once (to get the index)
[10:49:46] <ron> Bartzy: considering I didn't really follow your question - no :)
[10:50:18] <ron> augustl: fwiw, we store ObjectIds and not hex strings.
[10:57:54] <augustl> algernon: ah good point, indexes makes it pretty much the same operation anyway I guess
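
The size question can be checked directly in the shell with Object.bsonsize(), which reports the BSON size of a document:

    var oid = ObjectId();
    Object.bsonsize({ref: oid})        // 12-byte ObjectId value
    Object.bsonsize({ref: oid.str})    // the same id as a 24-character hex string - noticeably larger
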
[10:58:16] <Bartzy> I have a design question:
[10:58:20] <Bartzy> I have a collection of photos
[10:58:25] <augustl> ron: I'm talking about giving it a hex string directly, though: users: ["123abc...."], I suppose mongodb doesn't store ObjectIDs in those scenarios?
[10:58:30] <Bartzy> every photo has a uid of the user that took that photo.
[10:59:06] <Bartzy> Now I want to show to a user all the photos of his/her friends, sorted by insert time. I have an index on {uid:1, _id:1} (_id used for sorting by insert time).
[10:59:37] <Bartzy> But if I do uid: {$in : [big list of uid of friends here] }, it takes up to 500ms to the query to execute!
[10:59:44] <Bartzy> Any other options I have ?
[11:19:50] <Bartzy> ron, algernon , any insight on my question about $in ?
[11:20:53] <remonvv> Bartzy, not without changing the schema. $in is an O(n) operation
[11:21:18] <remonvv> Bartzy, I'm assuming you added an index. If not, that will help some.
[11:22:38] <Bartzy> remonvv: I'm willing to change the schema.
[11:22:44] <Bartzy> But I don't know to what.
[11:22:55] <Bartzy> remonvv: And yes, I've added an index: {uid: 1, _id:1}.
[11:23:06] <Bartzy> And it's using uid for the $in and _id for .sort()
[11:25:08] <Bartzy> remonvv: Just checked without the sort, it's much faster. Perhaps the sort is making it slow for some reason ?
[11:29:43] <remonvv> It will, scanAndOrder:true?
[11:56:49] <Bartzy> remonvv: What is that? Why the sort makes it slower ?
[11:57:04] <Bartzy> I'm trying to benchmark the queries with PHP now, because of the large dataset in $in
[11:57:22] <Bartzy> It works, but I don't know what to time ? The cursor returns Immediately, I think ?
[11:59:04] <Bartzy> ah, calling next() on the cursor executes the query
[12:11:33] <Bartzy> remonvv: ScanAndOrder:true :|
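
For context: scanAndOrder:true means the index could not hand the documents back already in the requested order, so the server re-sorts them in memory. A sketch of checking it on Bartzy's photos collection, where friendUids stands in for his $in list:

    var friendUids = [101, 102, 103];   // illustrative
    db.photos.find({uid: {$in: friendUids}}).sort({_id: -1}).limit(50).explain()
    // With the {uid:1, _id:1} index, each uid in the $in list produces its own _id-ordered run;
    // the server has to merge and re-sort those runs before applying limit(), hence scanAndOrder:true.
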
[12:29:21] <augustl> are there any atomic operations built-in for moving a document from one collection to another?
[12:29:52] <deoxxa> atomic operations are for sissies
[12:30:22] <augustl> :D
[12:31:30] <Bartzy> I seem to be unable to connect to a mongod instance @ localhost using the mongo shell
[12:31:35] <Bartzy> it's just stuck at "Connecting to test"
[12:31:42] <Bartzy> the port is up
[12:32:34] <augustl> considering moving docs to a separate collection for archived documents, but I suppose setting a flag is more than good enough for now
[12:33:22] <Bartzy> augustl: If it's a big collection with lots of indexes... and you're not using the archived documents, maybe it's better to archive ?
[12:33:26] <Bartzy> I don't really know, just asking ;)
[12:33:32] <augustl> Bartzy: yeah me neither :)
[12:33:38] <augustl> prematurely optimizing, as always
[12:37:00] <Bartzy> It's really weird - I rebooted the server, and the port seems up in netstat , and mongodb is up, but telnet localhost 27017 doesn't work
[12:37:04] <Bartzy> mongod is not accepting the connections
[12:37:51] <ninja_p> Has anyone had much luck testing the new aggregation stuff with the PHP driver?
[12:38:53] <Bartzy> nevermind.
[12:43:22] <augustl> Bartzy: what was the problem?
[12:44:34] <Bartzy> loopback interface was down. stupid :)
[12:45:20] <deoxxa> ...how
[12:55:26] <ninja_p> I can see no way of sending new aggregation pipeline queries using the php driver :(
[13:01:06] <skot> ninja_p: you can run any command you want via the php driver.
[13:01:42] <skot> There may not be a helper in the shell yet but that is just a convenience, not a requirement.
[13:03:51] <ninja_p> yeah, i just got it actually
[13:04:11] <ninja_p> was having issues with the format in which to pass the keys to the argument it takes
[13:04:15] <ninja_p> thanks very much though
[13:29:14] <fredix> hi
[13:29:35] <fredix> is it possible to find and get in one query an embedded doc ?
[13:29:46] <fredix> with C++ driver
[13:43:15] <NodeX> find and get ?
[13:47:27] <adamcom> I think he means return only the embedded fields, but I could be wrong
[13:48:32] <fredix> adamcom: yes
[13:49:24] <fredix> I try db.payloads.find({'steps._id': ObjectId("4ffadb1b62770d4e13724937")}) but it returns all the payload bsonobj not only the embedded
[13:49:43] <adamcom> you can definitely do it in the shell - http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields
[13:49:57] <adamcom> you need to add a projection to the query
[13:50:23] <fredix> ooh
[13:51:05] <adamcom> specifically you need to exclude _id of the root doc, so {_id : 0, steps._id : 1}
[13:51:09] <adamcom> or similar
[13:51:47] <fredix> ok thx !
[13:53:16] <fredix> db.payloads.find({'steps._id': ObjectId("4ffadb1b62770d4e13724937")}, {_id :0, steps._id :1}) ?
[13:57:22] <fredix> ok this works > db.payloads.find({'steps._id': ObjectId("4ffadb1b62770d4e13724937")}, {steps: 1})
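
The earlier attempt most likely failed because dotted field names have to be quoted in the shell's JavaScript; with quotes, the projection adamcom suggests works as given:

    db.payloads.find(
        {"steps._id": ObjectId("4ffadb1b62770d4e13724937")},
        {_id: 0, "steps._id": 1}
    )
    // returns only the steps array of the matching document, with just the _id of each element
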
[14:20:41] <Bartzy> Is that incorrect? http://www.deanlee.cn/programming/mongodb-optimize-index-avoid-scanandorder/
[14:21:43] <Bartzy> I'm using $in on user_id field and sorting by _id. When I use the index {uid:1, _id:1}, I get scanAndOrder:true in explain(), but the query is (pretty) fast. When I use the index {_id:1, uid:1}, the explain() doesn't even end in order for me to see what happens.
[14:53:34] <ninja_p> anybody much experience with the new aggregation stuff?
[14:53:40] <ninja_p> seeing some pretty weird results :(
[15:04:43] <lolwtfyeh> Hey, how can I query an array of ISODates to find all of the timestamps that are in the hour 13?
[15:24:04] <fredix> re
[15:24:19] <fredix> db.payloads.find({"steps._id": ObjectId("4ffaf4b3ee838f6fd4135398")}, {_id: 0, "steps.order": 1}) return { "steps" : [ { "order" : 1 } ] }
[15:24:41] <fredix> is there a way to return only { "order" : 1} ?
[15:37:57] <fredix> I obtain {"order": 1} with this awful code : BSONElement l_order = step_order.getField("steps").embeddedObject().getFieldDotted("0.order");
[15:39:36] <adamcom> you can use $slice to make sure only one value is returned, for example {_id: 0, "steps.order": {$slice : -1}} in the projection (or 1) but I don't think you can isolate just that sub element in the way you mean
[16:06:25] <looopy> how would i go about dealing with a list of embedded documents? like...user has many companies. I have my company document mapped out. so now i'm wondering what's a good way to reference each company a user may create...ListFields looks ok but i'm not sure. I'm using python/mongoengine by the way
[16:17:20] <npa> I have a Mongo 2.0.4 question. I'm writing a Ruby script to call mongodump on a sharded collection. if I lock the cluster according to the manual, and I write to the collection with another script while executing the dump, the dump just hangs. the backup script works fine when I do not use the locks. is there a good way to still create consistent backups with a script without using fsync+lock (besides using journaling and snapshots)? also, has anyone else encountered this sort of problem with fsync+lock before in MongoDB 2.0.4?
[16:26:02] <Sim0n> Hiya. I have a big collection with two compound indexes on field_1_date_1 and field_1_date_-1. And I noticed with explain() that when doing a query "db.col.find({'field':'something'}).sort({'date':1})" the field_1_date_-1 index is used, but in reverse! So my question is: can i drop the field_1_date_1? It doesn't seem to be used by mongodb and is taking up close to 6gb.
[16:32:20] <looopy> got it
[16:52:33] <amontalenti> if you create a mongodb hidden secondary (e.g. for a backup) and remove documents from some of its collections, replications of new documents from the oplog will continue to work, right? I am thinking of a use case where we have a secondary that stores "last 30 days of data" as a backup by pruning old documents only on the hidden secondary.
[17:00:15] <freezey> 10gen ready for a beerpong whooping?
[17:05:11] <bjori> would love to show you how the game should be played
[17:05:17] <bjori> ,)
[17:09:49] <jonwage> I have a situation where I $pushAll many elements on to an array, a field inside that array is indexed and when I $pushAll hundreds of items it takes ~2 seconds locking mongod, if I $pushAll in batches of 20 or so will that be better or will I still face the same problem? When I test locally the lock % reports actually being higher when I push them in batches instead of all at once
[17:10:06] <jonwage> that is not what I expected to see
[17:26:11] <jonwage> anyone have any experience with $pushAll and pushing on hundreds/thousands of elements?
[17:32:45] <adamcom> question - did anyone see my response to npa re: mongodump or did my client disconnect me at the worst possible moment?
[17:33:33] <npa> I didn't see it
[17:34:12] <adamcom> ok, re-post imminent
[17:34:49] <adamcom> heh, must be too long, just dumped me off again
[17:35:11] <adamcom> npa:  the mongodump is just a read, albeit a long running one - if you have writes waiting, as soon as it yields the read lock for any reason, the write locks will grab it and they are greedy, so will hold the lock until there's a gap in writes - then the readers get another look
[17:35:20] <adamcom> When you fsync & lock you block those writes and that doesn't happen.
[17:35:29] <adamcom> How are you running the dump though? if you point it at a replica set it will run with slaveOK on by default and you would be reading from the secondaries - might get a bit better performance there, if not you can fsync/lock them without stopping writes on the primary
[17:35:31] <sebastiandeutsch> I just added a new replicaSet Node to my conf. When I take a look at rs.status() it says that the node is recovering, after about 20s something fails "DBClientBase::findN: transport error". Then the uptime is at 0 again and the node is recovering. Is this normal? (Info the db is pretty large ~13gb)
[17:36:36] <adamcom> sebastiandeutsch: the transport error - that from the primary or from the new member?
[17:38:15] <sebastiandeutsch> adamcom: it looks like this: transport error: wla3:27017 query: { replSetHeartbeat: \"watchlater\", v: 7, pv: 1, checkEmpty: false, from: \"wla2:27017\" }
[17:38:29] <sebastiandeutsch> adamcom: wla2 is my primary
[17:39:44] <adamcom> transport error usually means the connection between the primary and the new member is flaky - any socket resets?
[17:41:13] <adamcom> the fact that it happens every time though……can you connect using a mongo shell from the new member to the primary and run commands? (similarly the other way around - mongo shell from primary to new member, run a few basic finds etc.)
[17:41:58] <sebastiandeutsch> adamcom: let me check
[17:46:47] <sebastiandeutsch> adamcom: when I try to run a command it says: "error: { "$err" : "not master and slaveok=false", "code" : 13435 }" I guess I have to wait for a while until it's fully recovered. Just wondering if an error occurs if mongo is so clever and restarts from an incremental position.
[17:47:29] <npa> adamcom: I'm using the Ruby driver to obtain the lock - there's an API call to get/release it. from there I connect to the proper secondary node for both shards and run a dump against them in parallel (I have tried running it serially and ran into the same issue)
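
A sketch of the approach adamcom describes: fsync-lock only the secondary you are dumping from, run mongodump against that member, then unlock; the primary keeps taking writes. Host, port and output path are illustrative.

    // in a mongo shell connected to the secondary being backed up
    db.fsyncLock()
    // from the OS shell:  mongodump --host secondary1.example.com --port 27018 -o /backups/2012-07-09
    db.fsyncUnlock()
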
[17:50:31] <rydgel> I tried to remove a shard, but the drain stopped. Any way to see what's going on?
[17:55:52] <adamcom> rydgel: run the removeshard command again - what does it return?
[17:56:33] <adamcom> sebastiandeutsch: during normal replication that would be the case - I think the copyDB command is a bit more fragile though
[17:57:19] <adamcom> sebastiandeutsch: if you suspect the network is not up to it, you could try seeding the new member with the data files from another member first - then it would not be copying so much data when it joins
[17:57:21] <rydgel> adamcom: this http://pastebin.com/KJKChKtp
[17:57:40] <sebastiandeutsch> adamcom: then I'll wait and see what's happening… copying 13gb over the network can be flaky. thanks for your time.
[17:58:25] <adamcom> rydgel: that means it's still draining
[17:58:44] <adamcom> when the 130 chunks are done
[17:59:01] <adamcom> then you will need to move any primaries off with the movePrimary command
[17:59:13] <rydgel> adamcom: I know, but it's been like that for 2 months. And yes the balancer is on. And yes chunks keep moving between the remaining shards
[17:59:14] <adamcom> run remove shard one more time after that and you should be good
[17:59:30] <adamcom> can you do sh.status() for me?
[17:59:42] <rydgel> on the shard itself?
[18:00:05] <adamcom> from the mongos
[18:00:16] <rydgel> oh yes, the printShardingStatus
[18:00:19] <rydgel> ok
[18:00:22] <adamcom> yep - shortcut
[18:00:24] <adamcom> :)
[18:02:03] <rydgel> adamcom http://cl.ly/3t1P3W3t3G0U0B3j1k2k
[18:03:59] <adamcom> rydgel: I'm guessing statigramprod.medias is getting a lot of writes?
[18:04:17] <rydgel> adamcom lots of writes/remove
[18:04:20] <rydgel> constantly
[18:04:39] <adamcom> one of the issues with migrations (which is what has to happen to drain) is that a shard can only participate in one at a time
[18:05:12] <adamcom> if you have a lot of writes (and hence splits, migrates) happening, then they are going to compete with the drain
[18:05:37] <adamcom> you can see the symptoms elsewhere too
[18:05:45] <adamcom> .popular is unbalanced
[18:06:13] <adamcom> looks like you have write hotspots on shard0008 and shard0009
[18:07:05] <adamcom> plus you have two relatively new shards on shard0013 and shard0014
[18:07:16] <rydgel> adamcom: Yes I can explain the hotspots, we got an issue where one chunk was unable to be moved. So we stopped the balancer for a few days, fixed the issue, and then relaunched it like 2 days ago
[18:08:03] <rydgel> adamcom: ok so you think I will have to wait for medias to be balanced, and then the draining will start again?
[18:08:18] <adamcom> right - which will again contribute to the contention - the balancer is now trying to do several things: 1. drain 002, 2. balance .popular, 3. migrate max chunk values around
[18:08:51] <adamcom> in terms of what to do, depends on priorities - you could turn off the balancer, then issue moveChunk commands manually
[18:08:59] <adamcom> that would at least allow you to prioritise
[18:09:11] <rydgel> adamcom: As you can see popular and the other one are already fully drained, but I guess you may be right, I will wait
[18:09:28] <rydgel> adamcom: I thought about doing it manually, but there are a lot of chunks
[18:09:51] <rydgel> and I'm not sure how to do it programmatically
[18:10:06] <adamcom> rydgel: sure - but for the drain at least it should be relatively easy - you need to move them all off 0002
[18:10:43] <rydgel> adamcom, yes. So I think I may be able to query the chunks on shard0002 and then move them
[18:10:48] <rydgel> I will think about it
[18:11:07] <rydgel> but for now I will let Mongo do its magic
[18:11:12] <adamcom> if you pass true to the sh.status command it will print all the ranges for you
[18:11:22] <rydgel> great
[18:11:37] <adamcom> since you have two almost empty shards, you could move 65 each to those
[18:11:58] <adamcom> http://www.mongodb.org/display/DOCS/Moving+Chunks
[18:12:51] <adamcom> that would at least get rid of one factor for you
[18:13:03] <rydgel> adamcom: thank you sir
[18:14:18] <adamcom> rydgel: welcome :)
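
A sketch of the manual route adamcom outlines, using the namespace and shard names from this conversation; the shard-key field and value in the moveChunk call are hypothetical, since they are not shown in the log.

    sh.status(true)                      // prints every chunk range on every shard
    // stop the balancer so manual moves don't compete with it (config database settings collection)
    use config
    db.settings.update({_id: "balancer"}, {$set: {stopped: true}}, true)
    // move one chunk currently on the draining shard0002 to one of the emptier shards
    db.adminCommand({
        moveChunk: "statigramprod.medias",
        find: {mediaId: 12345},          // hypothetical shard key and a value inside the chunk
        to: "shard0013"
    })
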
[18:18:10] <revnoah> I'm having an issue with mongoose and manual references. My Event model has a manual reference to a place id. What I would like to do is query the Event model and retrieve the Place title along with the rest of the Event data. I thought the simplest approach would be to use a virtual in the Event and look up the Place by using the findOne() function but the virtual appears to be synchronous while the findOne() function operates asynchronously.
[18:18:19] <revnoah> 1) is this even the right approach?
[18:18:52] <revnoah> 2) if not, what is the correct way to get a manual reference using mongoose?
[18:19:53] <aheckmann> revnoah: http://mongoosejs.com/docs/populate.html
[18:21:10] <revnoah> aheckmann: ah, interesting.
[18:23:48] <revnoah> aheckmann: so it only works with ObjectIds though? That complicates things a bit since I'm originally pulling data from a mysql (with a Drupal backend) and I would prefer to simply use those references.
[18:24:52] <aheckmann> revnoah: yes object ids only. so the getters/setters/virtuals are all sync, your next option is to manually set up some helper methods
[18:25:25] <aheckmann> revnoah: yourSchema.methods.whatever = function (cb) {… cb(err, yourResults) }
[18:26:17] <revnoah> aheckmann: yes, which I understand in theory but am very unsure how to go about it.
[18:28:12] <aheckmann> revnoah: back to the object ids, they can be either a string, number or ObjectId _id type
[18:28:21] <aheckmann> revnoah: if that still doesn't work
[18:28:45] <aheckmann> revnoah: then you'll have to manually do your queries and pass a custom result set back
[18:30:01] <aheckmann> revnoah: what is your "place id"? is it a string, number, ObjectId, or buffer?
[18:30:11] <revnoah> aheckmann: it's an int
[18:30:17] <revnoah> or Number
[18:30:19] <aheckmann> revnoah: you can still use populate() for that
[18:30:27] <revnoah> aheckmann: okay, excellent
[18:30:32] <aheckmann> revnoah: sorry for the confusion
[18:31:17] <revnoah> aheckmann: that's quite alright. The confusion is mostly with the vague documentation.
[18:31:33] <aheckmann> revnoah: it's being rewritten now
[18:33:21] <aheckmann> revnoah: btw we also have the #mongoosejs channel
[18:33:25] <revnoah> aheckmann: Stellar. Are you the primary author? I see you contrib quite a bit. For the most part, things are great and easy to use, and you certainly have my thanks for not only maintaining it but also giving help in irc
[18:33:42] <revnoah> aheckmann: I didn't realize that. I'll join that one for mongoose-specific questions
[18:33:50] <aheckmann> revnoah: thanks
[18:34:07] <aheckmann> revnoah: "primary author" is kind of vague
[18:34:23] <aheckmann> revnoah: a lot of the current api is from guillermo rauch
[18:34:24] <revnoah> aheckmann: yup, sure is. :)
[18:34:37] <aheckmann> revnoah: but i've been maintaining for over a year
[18:40:43] <revnoah> aheckmann: excellent work thus far. it's taking some work to get my head around it, but it's very cool stuff. Old dogs can learn new tricks, it just takes longer :)
[18:41:33] <aheckmann> revnoah: woof! :)
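
A sketch of the populate() approach aheckmann points revnoah to, using a numeric manual reference; schema and field names are illustrative, and the populate call follows the current mongoose documentation (exact syntax differs a bit across mongoose versions).

    var mongoose = require('mongoose');
    var Schema = mongoose.Schema;

    // Place keeps its numeric Drupal id as _id; Event stores that number as a manual reference.
    var PlaceSchema = new Schema({ _id: Number, title: String });
    var EventSchema = new Schema({ title: String, place: { type: Number, ref: 'Place' } });

    var Place = mongoose.model('Place', PlaceSchema);
    var Event = mongoose.model('Event', EventSchema);

    Event.find().populate('place').exec(function (err, events) {
      // each event.place is now the full Place document, so event.place.title is available
    });
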
[19:17:56] <vguerra> hello there
[19:18:25] <vguerra> I was wondering if there are any mongo users hanging around here from Vienna, Austria ?
[19:26:54] <jstout24> question: so if we had tons of writes (lets say a tracking table for instance)… if a user hits a page and clicks a link… the first page will write the impression and the link needs to do a lookup for the impression then update a field on it… would we have a problem doing this?
[19:29:49] <jstout24> currently, we're inserting, then when a user clicks, we're doing a find on the object id, then update.
[19:43:27] <amontalenti> jstout24, you probably want to look at modifier operations (http://www.mongodb.org/display/DOCS/Updating#Updating-ModifierOperations) and findAndModify (http://www.mongodb.org/display/DOCS/findAndModify+Command)
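
A sketch of what those links suggest for the impression/click case, in shell syntax; collection and field names are illustrative.

    // write the impression when the page is served
    var impressionId = ObjectId();
    db.impressions.insert({_id: impressionId, page: "/landing", ts: new Date(), clicked: false});

    // on click, modify in place rather than fetching the whole document and re-saving it
    db.impressions.update({_id: impressionId}, {$set: {clicked: true, clickedAt: new Date()}});

    // or do the lookup and update atomically in a single command
    db.impressions.findAndModify({
        query:  {_id: impressionId},
        update: {$set: {clicked: true, clickedAt: new Date()}},
        new:    true
    });
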
[19:49:41] <kenneth> hey--we're seeing a bunch of errors on the php driver with "couldn't determine master". what's up with that, it seems random and we can't figure it out!
[19:50:18] <ninja_p> hey kenneth, what is your setup?
[19:50:33] <kenneth> seeing this on a shared + replicated cluster
[19:50:41] <ninja_p> mod_php, fpm ?
[19:50:48] <kenneth> 4 shards, each shard has repl set of two secondaries + one master
[19:50:51] <kenneth> mod_php
[19:51:31] <kenneth> connecting with persistent, setting slaveok
[19:51:46] <ninja_p> authenticating?
[19:51:50] <kenneth> no auth
[19:52:02] <kenneth> (firewalled from the rest of the world)
[19:52:13] <kenneth> what's the official way to connect to a sharded cluster btw, is it the same as a replset?
[19:52:32] <ninja_p> i dont know tbh, we only run replset
[19:52:42] <ninja_p> we see the issue you mention as well though
[19:52:51] <ninja_p> im just trying to find a bugreport for you
[19:53:43] <ninja_p> what version of the php driver you using btw?
[19:53:48] <ninja_p> https://jira.mongodb.org/browse/PHP-329
[19:53:58] <ninja_p> ^^ that may be of use to you
[19:59:36] <Bartzy> Anyone from 10gen here?
[19:59:49] <Bartzy> Sent a consulting request, wondering if it can be answered in the next 2 hours
[20:00:25] <sl00> My documents contain an array (plots) of objects with the syntax {type:<integer>,color:<integer>} and now I want to find all documents that have an object {plots.type:1,plots.color:0xffffff} in that array. How can I do that without also getting {plots.type:2,plots.color:0xffffff} ?
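
This question goes unanswered in the log; the usual answer is $elemMatch, which requires both conditions to hold on the same array element. The collection name below is illustrative, the field names follow sl00's description.

    // plain {"plots.type": 1, "plots.color": 0xffffff} would also match documents where
    // one array element has the type and a different element has the color
    db.docs.find({plots: {$elemMatch: {type: 1, color: 0xffffff}}})
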
[20:03:52] <bjori> Bartzy: meaning a new ticket on an existing contract, or completely new thing?
[20:04:08] <bjori> either way really, most are responded to very quickly
[20:04:34] <bjori> Bartzy: if you have a critical case.. call :)
[20:05:46] <Bartzy> bjori: It's a new thing, I sent it via the web form of consulting. We need to get some consulting on our hardware spec before ordering it
[20:05:55] <Bartzy> Is that something "worth" doing with the paid support?
[20:06:01] <Bartzy> Or can we just get some quick answers here? :)
[20:08:36] <bjori> Bartzy: the simple answer is "lots and lots of ram, and a raid-10 disk setup"
[20:08:47] <bjori> the complicated answer depends on your dataset and how you access it
[20:11:04] <bjori> Bartzy: http://www.mongodb.org/display/DOCS/Production+Notes#ProductionNotes-WhatHardware%3F :)
[20:11:20] <bjori> Bartzy: (see also the child pages)
[20:12:20] <Bartzy> bjori: S/W RAID 10 or H/W ?
[20:12:27] <Bartzy> Because we're getting 4 x 256GB Samsung 830 SSDs
[20:12:55] <Bartzy> and I guess it doesn't need more IO than a single SSD, and RAID10 would not help a lot in performance, so that's just to make them a JBOD... I guess ?
[20:13:20] <Bartzy> bjori: So we were wondering, hardware RAID, or mdadm.. with respect to a mongo setup
[20:14:11] <Bartzy> Also, what about CPU? We're taking dual E5-2620 so we'll be able to upgrade to lots of RAM (The motherboard only has 4 DIMMs per CPU socket, so if we take only 1 CPU, we get 16GB x 4 which is 64GB... Not so much for upgrades, and 32GB DIMMs are very costly)
[20:16:28] <timing> hi all!
[20:19:56] <Bartzy> bjori: ? :|
[20:24:18] <timing> in a webapp running on mongo, do you guys just pass the objectId around in the GET params? or do you generate numeric ids and put an index on it?
[20:24:57] <timing> I think they are rather long for some of my collections
[20:25:11] <tystr> we're just using MongoIds
[20:26:05] <bjori> Bartzy: you are asking questions I cannot answer :)
[20:26:20] <tystr> URLs aren't as "pretty" (e.g. item/edit/4ffb37d38833221d44425e89), but meh
[20:26:20] <timing> ok, just because you can? or is there a reason I don't know about?
[20:26:29] <Bartzy> bjori: Who can? ;)
[20:26:31] <timing> tystr: yes indeed
[20:27:40] <bjori> BruNeX: twitter ;)
[20:27:42] <bjori> blah
[20:27:46] <bjori> Bartzy: twitter :)
[20:28:00] <Bartzy> bjori: Twitter @10gen ?
[20:28:12] <bjori> Bartzy: no no, just #mongodb :P
[20:28:57] <Bartzy> ok :)
[20:28:57] <bjori> Bartzy: I have a hard time imagining someone from 10gen will actually bill you for tips on hardware
[20:29:25] <Bartzy> bjori: Well, if I get to their support by the paid consulting way.. how they won't ?
[20:29:26] <bjori> Bartzy: but like the doc say.. we aren't hardware specialists.. and everything depends on your data and access pattern
[20:29:35] <bjori> there is no one-size-fits-all solution
[20:29:42] <Bartzy> I wish someone from 10gen would just pop here for 5 minutes for quick answers about my questions :)
[20:30:12] <Bartzy> And then for the schema consulting and more in depth consulting of our setup, we'd get the paid support after installing the server and migrating the data and doing our own testing :)
[20:30:23] <bjori> but like I said, there isn't a definitive answer. it all depends
[20:30:31] <bjori> how much data are you expecting ?
[20:30:38] <bjori> how much traffic to the database?
[20:30:45] <bjori> are you going to run several shards?
[20:31:03] <Bartzy> bjori: Between 100 and 200GB now, growing fast... 100+ million documents, growing at a pace of 500,000 per day
[20:31:32] <Bartzy> bjori: Don't know about traffic. Why does it matter if the network is enough for it ?
[20:31:52] <Bartzy> I don't want to run several shards at the beginning. We want to scale vertically as much as possible on reasonable price. 64GB RAM for now.
[20:32:15] <Bartzy> We're just looking for: "Most of our setups using SSDs are using THIS type, and THIS raid configuration"
[20:32:33] <Bartzy> or "You really don't want that 12 core dual CPU setup", or the opposite :)
[20:33:48] <bjori> Bartzy: sounds like you first want schema suggestions and then recommendation on a hardware :)
[20:34:04] <bjori> all I can give you right now is the general suggestions :)
[20:35:12] <jstout24> on mongodb site: http://dl.dropbox.com/u/65317585/Screenshots/5a58.png
[20:35:22] <jstout24> heh
[20:35:31] <tystr> as opposed to non-memory use of your memory
[20:36:00] <Bartzy> bjori: Yep, but we need to order the boxes now. so...
[20:36:18] <bjori> Bartzy: see private :)
[20:42:26] <dstorrs_> is there a way to exclude a collection from replication?
[20:43:00] <dstorrs> I've got a handful of collections I use for my job manager, and they don't need to be replicated. It would just spam the oplog.
[20:49:41] <bjori> dstorrs: use the local database? :)
[20:50:08] <dstorrs> oh. is that what that's for?
[20:50:18] <dstorrs> Anything in local is "do not replicate"?
[20:50:21] <bjori> no, but it can be used for all sorts of cool things :)
[20:50:31] <bjori> anything in local is local to that server
[20:51:00] <dstorrs> ...
[20:51:14] <dstorrs> what's the difference between "local to that database" and "do not replicate" ?
[20:52:12] <bjori> dstorrs: for your specific case, nothing
[20:52:19] <dstorrs> Ok.
[20:52:41] <dstorrs> I'm gonna run and grab lunch, then I'll RTFM about the 'local' DB. Thanks for the pointer.
[20:52:49] <bjori> :)
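
A sketch of bjori's suggestion: anything written in the local database stays on that member and never goes through the oplog, so it is not replicated. The collection name and document shape here are hypothetical.

    use local
    db.jobQueue.insert({job: "prune-old-entries", state: "pending", ts: new Date()})   // never replicated
    db.jobQueue.find({state: "pending"})
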
[20:53:08] <jonwage_> does anyone know why the default to MongoCursor::count() is false? http://us.php.net/manual/en/mongocursor.count.php
[20:53:33] <jonwage_> it seems error prone, if someone does this $cursor = $collection->find()->limit(5); echo count($cursor); // expect I would get 5
[20:53:41] <jonwage_> but i get the total count for the cursor without limit
[20:54:08] <jonwage_> i can work with this, it just seems unintuitive and error prone
[20:56:15] <timing> Back again :-)
[20:56:23] <kchodorow> jonwage_: backwards compatibility
[20:56:38] <jonwage_> kchodorow: alright, we got majorly bit by this
[20:56:44] <kchodorow> the db didn't used to take limit & skip info for count
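
The mongo shell has the same default, which makes the surprise jonwage_ hit easy to reproduce; db.users is illustrative.

    var cursor = db.users.find().limit(5);
    cursor.count()       // like the PHP default: total documents matching the query, ignoring limit/skip
    cursor.count(true)   // applies limit/skip
    cursor.size()        // shorthand for count(true)
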
[20:56:48] <bjori> jonwage_: it doesn't implement countable does it?
[20:56:53] <jonwage_> we were just doing a php count() in templates
[20:57:05] <jonwage_> let me check
[20:57:19] <timing> If I want to store a very large array in a collection and I don't want to query on it, is it better to serialize the object as a string or can I just put the array into the collection as is
[20:57:33] <timing> ( I know I can, but is it wise )
[20:57:38] <bjori> jonwage_: nope. it doesn't implement it :)
[20:58:11] <bjori> jonwage_: seems like a trivial fix though
[20:59:03] <bjori> jonwage_: wait. re-reading your question.. your example and question don't match..
[20:59:06] <bjori> :)
[20:59:11] <jonwage_> ya I'm dbl checking my stuff now
[21:00:32] <jonwage_> bjori: hmm I'm not sure what is going on yet, when i call echo count($cursor) right now it is still calling $cursor->count()
[21:01:08] <jonwage_> bjori: i was dbl checking that maybe it was my library on top of mongo pecl extn that implemented countable
[21:01:18] <jonwage_> but i only implement Iterator like the base MongoCursor does
[21:05:17] <bjori> jonwage_: hmh... count($cursor) only returns 1 for me
[21:05:31] <bjori> jonwage_: and I don't see how it could return anything else since it doesn't implement the countable interface
[21:06:29] <jonwage_> ok one more sec
[21:06:35] <jonwage_> possible it is still something in my library
[21:06:45] <bjori> doctrine?
[21:06:49] <jonwage_> yes
[21:07:36] <jonwage_> weird, i only implement Iterator in my class that wraps MongoCursor
[21:07:46] <jonwage_> but for some reason count() is working on it
[21:07:46] <jonwage_> a
[21:07:51] <jonwage_> and calling ->count()
[21:07:54] <jonwage_> i must be missing something
[21:08:24] <jonwage_> ahh
[21:08:26] <jonwage_> im an idiot
[21:09:04] <bjori> http://pastebin.com/caipi7RL
[21:09:13] <bjori> jonwage_: you found the problem? :)
[21:09:27] <jonwage_> yup i implement Countable :)
[21:09:32] <jonwage_> so tis' my fault
[21:09:44] <bjori> :)
[21:10:25] <jonwage_> now to decide what change to make
[21:11:37] <jonwage_> i think it was a mistake making my Cursor class implement Countable
[21:13:51] <bjori> jonwage_: I'm sure we can have MongoCursor implement Countable by tomorrow for you if you want (:
[21:14:15] <jonwage_> bjori: hmm, well i don't necessarily "want" it, it's just now it is already there
[21:14:27] <jonwage_> i want count($cursor) to equal $cursor->count(true)
[21:14:34] <jonwage_> but i want to keep $cursor->count(false) as the default
[21:14:35] <jonwage_> to
[21:14:35] <jonwage_> m
[21:14:36] <jonwage_> aint
[21:14:37] <jonwage_> to maintain BC
[21:14:52] <jonwage_> but i can't do that :(
[21:15:22] <jonwage_> in our application sometimes cursors get created and passed through to the templating layer
[21:15:35] <jonwage_> where we may do things like {% if users|length > 0 %}
[21:15:36] <jonwage_> w
[21:15:42] <jonwage_> which results in $users->count(false) :/
[21:15:58] <jonwage_> so i am thinking I'm gonna add a listener for all view params and convert cursors to an array
[21:16:12] <jonwage_> i don't wanna break BC in Doctrine
[21:16:17] <jonwage_> I'm not sure :/
[21:17:21] <bjori> hmh
[21:17:22] <jonwage_> and i don't wanna change $foundOnly default to true, because i wanna keep it in line with the MongoCursor API
[21:18:03] <bjori> it is too late here for me to give any sensible feedback on your thoughts =)
[21:18:12] <jonwage_> please do
[21:22:44] <jonwage_> bjori: certainly :)
[21:26:14] <kchodorow> bjori: what? what?
[21:26:50] <Derick> hm? :)
[21:27:08] <jonwage_> kchodorow: Derick so this ended up being my fault, I have a Doctrine\MongoDB\Cursor that implements countable
[21:27:12] <jonwage_> and it calls MongoCursor::count()
[21:27:27] <jonwage_> so doing $cursor = $collection->find()->limit(5); echo count($cursor); gives me results > 5
[21:27:28] <Derick> i've only seen half of the conversation
[21:27:45] <jonwage_> don't worry about it, it was my fault
[21:28:15] <jonwage_> stop with all the hitting!!!
[21:28:15] <jonwage_> :)
[21:28:30] <kchodorow> he started it :-P
[21:28:35] <jonwage_> :)
[21:31:43] <bjori> kchodorow: sorry.. thought jonwage_ was right for a second there and was gonna blame you for some weird intercept magic :)
[21:31:52] <jonwage_> :)
[21:32:27] <dstorrs> kchodorow: <voice = "maternal"> and if he started jumping off a bridge, would you do THAT too, hmmmmmmmm? </voice>
[21:33:03] <dstorrs> kickban coming in 3...2...1....
[21:33:04] <kchodorow> damn
[21:33:05] <dstorrs> :>
[21:33:22] <dstorrs> (on me, that is)
[21:34:28] <bjori> its not gonna work! I have your ip address!! moahaha!
[21:34:47] <dstorrs> oh yeah?!
[21:34:50] <dstorrs> well....
[21:34:51] <dstorrs> um...
[21:34:56] <dstorrs> ok.
[21:46:21] <nimijr> So when upgrading to a sharded server, does the data auto migrate between the two servers?
[21:50:56] <bjori> nemothekid: I don't see why it wouldn't
[21:51:07] <bjori> nemothekid: it depends on your shard keys though, of course
[21:51:30] <nemothekid> bjori: I just don't see any data on the second database, the database wasn't even created
[21:51:48] <nemothekid> bjori: using a show collections on the sharded collection
[21:51:52] <nemothekid> *show dbs
[21:52:30] <bjori> nemothekid: how did you initialize the second server?
[21:52:42] <bjori> nemothekid: and are you sure there should be any data on the second server?
[21:53:11] <bjori> nemothekid: you should be able to see the migrations from your server A if you check the logs carefully enough
[21:58:34] <nemothekid> bjori: From the logs alone I can't tell. I see accepted connections from my config box. Is there any other way I can tell the existing data is already being sharded. Thanks for your help btw
[21:59:58] <nemothekid> nvm figured it out
[22:01:33] <nemothekid> bjori: one of our config servers was set to PDT