[00:11:50] <multi_io> is there a way to atomically create a document ONLY IF it doesn't exist yet (as determined by a query for it returning nothing)?

[00:12:17] <multi_io> ...but if the query does return something, don't modify the document?

[00:15:24] <richardraseley> I have a question related to sharding in MongoDB. So, I understand that sharding is done at the collection level, does that mean that any fields within any documents within that collection could be on any shard in the MongoDB infrastructure?

[00:16:19] <richardraseley> I didn't know if the document was the most granular unit that could be sharded or if fields within the same document could live in different shards?

[00:17:06] <multi_io> richardraseley: the document is the most granular unit.

[00:17:56] <richardraseley> So, the name of the document is the primary attribute that determines the placement within the partition?

[00:18:51] <richardraseley> Or is there a specific field within the document that is keyed off from?

[00:21:07] <richardraseley> multi_io: So, to clarify - a document can never be spread across multiple chunks

[00:21:25] <richardraseley> And therefore can never exist in multiple shards?

[00:31:26] <richardraseley> so the mongos process acts as a transaction coordinator / proxy for user requests - does that service have to query the config service for each query to determine where the chunks exist in the shards, or do they maintain knowledge of the chunk / shard configuration?

[00:31:54] <multi_io> richardraseley: you choose the field. look for "sharding key" in the docs.

[00:32:11] <richardraseley> multi_io: I see - thank you.

[00:38:31] <therealkasey> anyone have a pointer to startupStatus code meanings? I'm trying to debug a replica set where the nodes don't seem to see each other.

[00:57:09] <jstout24> is it good or bad practice to have multiple databases per application?

[01:24:56] <therealkasey> hmm, i figured out my replication issue, but i'm not quite sure what to do about it. i have a replicaset config that has gone out to all nodes with an incorrect hostname (the host that originally was the primary). the primary demoted itself to secondary, so i can't reconfigure the set from there. so i've got a set with no primary and a bad config on all nodes.

[01:27:52] <therealkasey> thinking about dropping my oplog and replset collections

[01:51:11] <kotedo> To all the MongoDB gurus out there: Can I have something like a many to many relationship in MongoDB?

[01:59:11] <alnewkirk> kotedo: sure, why not?

[02:00:16] <kotedo> alnewkirk: I think I am just brain damaged from too much SQL lately

[02:00:37] <alnewkirk> thats entirely possible

[02:00:47] <kotedo> any pointers?

[02:01:21] <alnewkirk> kotedo: table-a <- relationship-table -> table-b

[02:01:29] <webjoe> what is your use case for a many-to-many relationship?

[02:01:38] <kotedo> users and groups

[02:01:40] <alnewkirk> kotedo: see mongodb.org and dbref

[02:01:55] <kotedo> a user can be a member of many groups and ...

[02:02:01] <webjoe> kotedo: you can just embed users in groups. denormalize

[02:02:15] <kotedo> that's what I did

[02:02:19] <kotedo> :)

[02:02:31] <kotedo> It just felt so ... DIFFERENT ..

[02:02:44] <alnewkirk> err ... depends

[02:02:55] <kotedo> I embedded the users in my group

[02:03:10] <alnewkirk> webjoe: fetch list of groups?

[02:03:19] <alnewkirk> ... using that design

[02:03:35] <alnewkirk> ... on millions of user records

[02:03:42] <alnewkirk> # fail

[02:03:46] <kotedo> :)

[02:03:51] <kotedo> I know ...

[02:04:02] <webjoe> no

[02:04:06] <webjoe> groups, has many users.

[02:04:16] <webjoe> each group has embedded array of users

[02:04:17] <kotedo> users has many groups

[02:04:27] <webjoe> you can have millions of groups (each group = 1 doc)

[02:04:34] <kotedo> webjoe: right

[02:04:38] <webjoe> each user has many groups

[02:04:46] <webjoe> embed array of groups in each user

[02:04:55] <webjoe> each user = 1 doc

[02:04:58] <webjoe> you have two collections:

[02:05:01] <webjoe> users, groups

[02:05:04] <kotedo> right

[02:05:07] <webjoe> they know about each other

[02:05:23] <webjoe> if you want to get complicated (and cute), what happens if a user is deleted? what happens if a group is deleted? etc

[02:05:27] <alnewkirk> webjoe: exactly, thats NOT denormalizing, thats many-to-many

[02:05:31] <webjoe> do it later in a batch

[02:05:35] <alnewkirk> which is not what you said at first

[02:05:48] <webjoe> it is denormalized - you have redundant data in each other table

[02:05:51] <webjoe> no many-to-many table

[02:05:58] <webjoe> sorry collection.

[02:06:55] <kotedo> so, maybe I should have a second database for millions of user and groups in a SQL fashion and the rest lives in MongoDB?

[02:07:15] <therealkasey> i would have an indexed field for groups in each user, where groups is a list of simple strings, then an indexed name attribute in the group

[02:07:41] <alnewkirk> there is no redundant data, and that design is very normal

[02:08:00] <webjoe> i'm assuming groups has more metadata than a string

[02:08:15] <webjoe> i mean if you want to jam it all into a nested array, don't let me stop you.

[02:08:20] <kotedo> it will have more meta data, yes

[02:08:30] <kotedo> urgh

[02:08:58] <kotedo> my framework supports a mix of databases

[02:09:05] <webjoe> kotedo - just go with whatever feels right, remember it's mongo, you're not committed to any structure

[02:09:06] <webjoe> you can fix it later

[02:09:20] <webjoe> why?

[02:09:25] <kotedo> maybe really going SQL on groups and users and all the other needed data is in mongo?

[02:09:28] <webjoe> are you building a framework for other uses?

[02:09:38] <webjoe> (shrug)

[02:09:44] <kotedo> no

[02:10:03] <kotedo> I am using a framework that supports multiple databases at once

[02:10:04] <alnewkirk> whats the significance of also using SQL?

[02:10:10] <webjoe> yea?

[02:10:17] <webjoe> (sort of bias, you're in a mongodb room)

[02:10:26] <kotedo> maybe just to keep that part "simple"

[02:10:33] <webjoe> really? ok.

[02:11:02] <kotedo> yeah, I know, and I am sorry, I am just trying to find the right tool for the job

[02:11:09] <kotedo> and I love mongodb

[02:11:31] <kotedo> mongodb and erlang ... = big love

[02:11:41] <webjoe> just do it. :)

[02:11:42] <alnewkirk> simply use dbref (with some metadata as webjoe recommended) approach

[02:11:52] <kotedo> I'll look into that

[02:12:19] <alnewkirk> nosql doesn't focus on referential integrity, that's left to your app code

[02:12:27] <kotedo> true

[02:28:56] <Kane`> does this schema look like the right fit for my queries?

[02:29:01] <Kane`> http://codepad.org/Jsa1LdaC

[02:30:20] <dstorrs> what are the 3 ISODates in 'timestamps'?

[02:30:41] <dstorrs> will there always be exactly 3, or is that a rolling log / constantly growing log ?

[02:31:29] <dstorrs> otherwise, yes, your schema looks good, but your indexes don't. you need multi-key indices in order to support those queries.

[02:32:16] <Kane`> dstorrs, there will be more than three. 10 or more per node per client, probably

[02:32:36] <Kane`> the 3 ISOdates are, well, ISOdates ._.

[02:32:38] <dstorrs> but it's a finite, limited number? not an ever-expanding list?

[02:32:56] <Kane`> it's ever-expanding up until the end of the day

[02:33:08] <dstorrs> at which point it gets wiped and restored?

[02:33:18] <dstorrs> if so you're ok. But you don't want it to grow forever

[02:33:37] <dstorrs> *get wiped and starts growing again tomorrow?

[02:33:38] <Kane`> it doesn't get wiped. it just stops being written to

[02:33:48] <dstorrs> ok, good enough.

[02:33:51] <Kane`> a new document will be made, the next day, to get logged to

[02:34:29] <dstorrs> as to the indices, try something like this: { node : -1, client : 1 } => "has [node] logged [client]?"

[02:35:03] <dstorrs> you want to put the higher-selectivity ones first, so node before timestamp before client

[02:35:40] <dstorrs> well, depending on what "10 or more timestamps" really means

[02:36:19] <Kane`> sorry, does that mean i should run my queries like: db.log.find({node: n, client: c}) as opposed to db.log.find({client: c, node: n}) ?

[02:36:52] <Kane`> dstorrs, "10 or more" is more of an average. some clients may only be logged by each node once. in that case, there'll only be one timestamp

[02:36:56] <Kane`> for the most part though, there will be many

[02:37:01] <dstorrs> the order you put them in the 'find' doesn't matter, but the order in the index matters a lot.

[02:37:29] <dstorrs> You want to build your index with the most selective item (i.e., the one with the fewest elements) on the left

[02:38:13] <dstorrs> check out this article on indexes: http://kylebanker.com/blog/2010/09/21/the-joy-of-mongodb-indexes/

[02:39:18] <Kane`> ah, that makes sense

[02:51:39] <dstorrs> what exactly is the difference between $pull and $pullAll in update()? they seem to do the same thing.

[02:53:41] <dstorrs> oh wait, I see. $pullAll takes an array and removes everything that matches any element of it.
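
The distinction dstorrs arrives at can be modeled in plain JavaScript without a database. This is only a sketch for scalar values: `pull` and `pullAll` are hypothetical helper names, not shell or driver functions, and the real `$pull` also accepts a match condition, which this model leaves out.

```javascript
// Plain-JavaScript model of the two update operators, for scalar values only.
// pull and pullAll are hypothetical helpers, not part of any MongoDB API.

// Models { $pull: { field: value } }: drop every element equal to value.
function pull(arr, value) {
  return arr.filter((x) => x !== value);
}

// Models { $pullAll: { field: [v1, v2] } }: drop every element equal to
// any of the listed values.
function pullAll(arr, values) {
  return arr.filter((x) => !values.includes(x));
}

const tags = ['a', 'b', 'c', 'a', 'b'];
console.log(pull(tags, 'a'));           // [ 'b', 'c', 'b' ]
console.log(pullAll(tags, ['a', 'b'])); // [ 'c' ]
```

Both helpers return a new array and leave the input untouched, whereas the server-side operators modify the stored document in place.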

[05:42:21] <Kage`> Anyone know of any PHP+MongoDB forums systems?

[06:28:55] <sagarchalise> Hi, this may be the wrong question to ask, but are there any resources on mongodb database samples for people who have only worked on RDBMS

[06:39:01] <ukd1> sagarchalise, what kind of samples

[06:41:13] <sagarchalise> ukd1: I am having a hard time designing a mongo schema. I am thinking the mysql way, in terms of tables and relations.

[06:41:40] <sagarchalise> ukd1: thinking about a mapping sample of mysql database to mongo db design just for reference

[06:45:00] <ukd1> sagarchalise, there are examples - I'll google for you

[06:45:16] <ukd1> - this might be handy : http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart

[06:45:55] <sagarchalise> ukd1: I am looking into those but they talk about queries.

[06:46:00] <ukd1> yea,

[06:46:17] <ukd1> http://www.mongodb.org/display/DOCS/Schema+Design

[06:46:24] <ukd1> http://www.mongodb.org/display/DOCS/Schema+Design#SchemaDesign-Videos - videos

[06:46:31] <sagarchalise> ukd1: what I am having problem with is say I have 10 tables on sql then would I go on creating 10 collections on mongo

[06:46:32] <ukd1> there are good examples in some of the videos

[06:47:07] <ukd1> you could basically copy the layout from MySQL - but it's usually not the best way of doing it in Mongo, but will work.

[06:47:21] <ukd1> watch some of those vids

[06:48:12] <sagarchalise> ukd1: ok thanks

[06:55:22] <heoa> Is there some command to see everything in DB?

[06:59:22] <heoa> Is there some command to quickly see examples/manual about things such as db.test.find(...)?

[06:59:34] <heoa> db.help(db.test.find) is not very informative...

[07:38:45] <[AD]Turbo> hola

[07:40:41] <Kane`> what happens if i try to write to a document that's already hit the 16mb limit?

[07:40:51] <Kane`> will mongodb silently ignore my write? or will errors be thrown/things break?

[07:41:37] <maik_> hi all

[07:41:45] <maik_> Anyone an ETA for jira to wake up? :-)

[07:51:43] <skot> maik_: ec2 is still bringing things up so we need to wait for EBS volumes to checkout

[07:52:18] <skot> Kane`: an error will be logged and returned to the client if they do safe writes

[07:52:59] <Kane`> good to know, skot. cheers

[07:56:14] <maik_> thx skot

[08:48:58] <spillere> i'm doing z = db.users.find(), I want to get all data from the db, when I print z, it gives <pymongo.cursor.Cursor object at 0xb0d120ec>

[08:49:10] <spillere> how can I display all data properly?

[08:51:21] <skot1> maik_: back up now

[08:51:32] <algernon> spillere: iterate over z.

[08:51:38] <carsten> fetch all data first - a cursor is a cursor and not the complete result set

[08:53:24] <spillere> ty

[08:53:30] <spillere> i guess ill get it working :)

[09:17:38] <multi_io> can a "find or initialize" sort of functionality be implemented atomically without locks?

[09:18:03] <NodeX> like an upsert ?

[09:18:07] <multi_io> i.e., is there a way to atomically create a document ONLY IF it doesn't exist yet (as determined by a query for it returning nothing)?

[09:18:28] <NodeX> an upsert!

[09:18:33] <multi_io> NodeX: if it already exists, I don't want to change it

[09:18:44] <multi_io> which an upsert does, I think...

[09:18:51] <NodeX> craft a sneaky upsert then

[09:18:57] <NodeX> that would not update anything!

[09:19:31] <NodeX> or you can put a unique index on the collection and insert on the constraint

[09:22:53] <multi_io> hm

[09:23:41] <spillere> where can I get a mongodb sticker?

[09:25:50] <multi_io> NodeX: problem with "craft a sneaky upsert that would not update anything" is that, if the document already exists, I don't know its contents. So I can't craft that upsert...

[09:26:03] <NodeX> then add a unique index

[09:26:22] <multi_io> (without first fetching the document, which would mean two operations, so it would no longer be atomic...)

[09:26:29] <multi_io> NodeX: yeah

[09:26:38] <multi_io> thanks, I'll look into that

[09:29:53] <NodeX> A normal constraint might be email on users or something

[09:46:35] <millun> hi, can i ask how to EnsureIndexes with native query syntax?

[09:50:48] <carsten> please?

[09:52:20] <millun> i am rewriting a part of code to work with BasicDBObjects as advised. it is going ok but i don't know how to issue "ensureIndexes" command

[09:53:06] <carsten> it is our business to guess framework and implementation language?

[09:53:28] <millun> java, spring. sorry

[09:53:45] <carsten> c# here

[09:53:48] <carsten> read your api docs

[09:53:53] <millun> okay

[10:33:00] <NodeX> $pull

[10:33:03] <NodeX> oops

[13:26:53] <rhqq> hey, how to check if replica is healthy?

[13:27:29] <carsten> http://www.mongodb.org/display/DOCS/Replica+Set+Commands

[13:28:43] <NodeX> send it to the docs :P !!

[13:29:00] <rhqq> which field will ensure me that i can switch master w/ replica and there will be no problems with it

[13:29:56] <rhqq> health that will be 0 for server down and 1 for server up tells me nothing...

[13:33:09] <rhqq> did you guys write something? :P

[13:38:01] <rhqq> my replica is having extremely high cpu utilization

[14:02:01] <rhqq> how can i check what is it doing? :P

[14:03:55] <carsten> by reading the cited docs

[16:20:57] <bingomanatee> can you guys see me?

[16:21:08] <UForgotten> bingomanatee: yes

[16:29:51] <rhqq> how big time-gap can be between nodes to make sure that replica will catch up to master?

[16:30:21] <rhqq> im creating new replica out of yesterdays backup

[16:31:10] <rhqq> and im wondering if it will catch up to master :P

[17:30:49] <multi_io> is it possible to have explicitly "shard-local" collections?

[17:33:06] <DPP> how does one give a sole surviving replicaset member permissions to be primary?

[17:33:13] <multi_io> i.e. a collection whose entire set of documents is different on each shard.

[17:33:33] <DPP> "replSet can't see a majority, will not try to elect self"

[17:33:34] <multi_io> (or is that the case with each non-sharded-collection automatically...hm)

[18:00:46] <kchodorow> DPP: http://www.mongodb.org/display/DOCS/Reconfiguring+a+replica+set+when+members+are+down

[18:03:58] <mitsuhiko> hey guys

[18:04:13] <mitsuhiko> does the startup order for mongos and mongo controllers matter? is mongodb supposed to start up properly if the services go up in random order?

[18:04:26] <mitsuhiko> i noticed mongos being stuck if we reboot the whole cluster at once for testing purposes

[18:09:55] <zirpu> did you restart mongos ?

[18:10:09] <mitsuhiko> zirpu: why don't they do that themselves

[18:10:11] <mitsuhiko> or at least die

[18:10:11] <zirpu> anything in the logs from mongos?

[18:10:17] <mitsuhiko> zirpu: yes, that's not the point

[18:10:20] <mitsuhiko> i know that a restart fixes it

[18:10:25] <mitsuhiko> the question is: why is there a zombie mongos left

[18:10:35] <mitsuhiko> why does it not either a) retry itself or b) just die with an error

[18:10:35] <zirpu> might be worth opening a bug i guess.

[18:10:50] <mitsuhiko> is nobody using this database in production? :-/

[18:11:02] <zirpu> yes, but they're asleep atm. :-)

[18:28:47] <mikerobi> is it possible to get a javascript stack trace for exceptions in db.eval calls?

[18:29:50] <tystr> I'm getting segmentation faults with the php MongoDB driver...

[18:29:54] <tystr> looks related to GridFS

[18:30:41] <tystr> https://gist.github.com/fc907cf626933a7e2f33

[18:48:35] <adamt> If anybody sees the author of the Haskell bindings, do let him know that there's a problem with the dependencies in the cabal files for both version 1.2 and 1.3 of the bindings.

[18:50:24] <ranman> will do

[20:16:33] <zonetti> does anyone know how to save html in a Mongo's String field?

[20:17:20] <Derick> you'd do it just like any other string

[20:17:25] <Derick> just make sure it's UTF-8

[20:19:03] <zonetti> Derander_, I'm trying PHP requesting a nodejs API through CURL... I already tried htmlentities, utf8_encode...

[20:19:10] <zonetti> Derick, I'm trying PHP requesting a nodejs API through CURL... I already tried htmlentities, utf8_encode...

[20:19:42] <zonetti> but it still doesn't work... =/

[20:21:12] <Derick> "doesn't work" isn't a very good starting point for offering help

[20:22:17] <Zelest> the opposite to "works" .. duuh!

[20:22:22] <Zelest> </trolling> :-)

[20:24:41] <dstorrs> I'm trying to do a match that returns only one embedded document and having trouble. Example is here: http://pastie.org/4094338

[20:25:08] <zonetti> Derick, I solved using base64 :O

[20:25:16] <dstorrs> I thought I could make this work with the positional document, but no joy (maybe just PEBKAC error, though)

[20:25:24] <zonetti> I mean, "I solved"

[20:25:26] <dstorrs> *positional operator

[20:29:56] <dstorrs> I tried this > db.temp.find({ 'pages.owner' : 'worker_9'}, { 'pages.$':1}) but it returns { "_id" : "carol", "pages" : [ { }, { } ] }, when what I actually want is { num:1, owner:'worker_8'}

[20:30:19] <dstorrs> (less the typo on the worker num)

[20:30:23] <dstorrs> any thoughts?

[20:33:51] <zonetti> dstorrs, your issue is actually a mongo issue, limitation...

[20:34:22] <jstout24> anyone know any good schema design for event tracking?

[20:34:35] <zonetti> some post in stackoverflow gives the link to mongodb repository when they say they will implement find() in embedded docs...

[20:34:36] <dstorrs> zonetti: ah, ok. So, I have to return the main document, although I can restrict to the particular top-level fields I care about?

[20:34:41] <jstout24> i'm trying to figure out the best indexes to use

[20:35:12] <zonetti> dstorrs, in the app I'm developing, when I need to do that, I find like you did and then iterate over the results =/

[20:35:25] <zonetti> dstorrs, I guess is the only way right now

[20:35:34] <zonetti> this*

[20:36:01] <dstorrs> Thanks for telling me -- at least this way I can stop looking for an answer
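
The workaround zonetti describes (fetch the parent documents, then iterate over them in application code) could look roughly like this. The sample document is hypothetical, shaped after the output quoted above, and `matchingPages` is an illustrative helper name rather than anything from the pastie.

```javascript
// Hypothetical result set, shaped like the documents quoted in the log.
const docs = [
  {
    _id: 'carol',
    pages: [
      { num: 1, owner: 'worker_8' },
      { num: 2, owner: 'worker_9' },
    ],
  },
];

// Collect only the embedded pages whose owner matches, across all documents.
function matchingPages(results, owner) {
  const out = [];
  for (const doc of results) {
    for (const page of doc.pages || []) {
      if (page.owner === owner) out.push(page);
    }
  }
  return out;
}

console.log(matchingPages(docs, 'worker_8')); // [ { num: 1, owner: 'worker_8' } ]
```

In the shell, `docs` would come from something like the `find({ 'pages.owner': ... })` call quoted above, with the filtering done after the documents arrive.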

[20:38:00] <multi_io> is ObjectId more efficient that using simple numbers as the id?

[20:38:03] <multi_io> *than

[20:38:17] <mitsuhiko> multi_io: why would you use a number?

[20:38:24] <mitsuhiko> multi_io: objectid solves a completely different problem

[20:38:29] <mitsuhiko> multi_io: how would you allocate numbers?

[20:38:42] <mitsuhiko> no, objectid is not efficient, that's not what it tries to do

[20:39:38] <multi_io> mitsuhiko: allocate using a separate "counter" or maybe using an optimistic strategy

[20:40:02] <dstorrs> multi_io: that's going to break HARD on a cluster

[20:40:03] <multi_io> the mongodb docs at one point say that you should use natural id if applicable, iirc.

[20:40:04] <mitsuhiko> multi_io: that's the reason why numbers are not used in distributed systems

[20:40:09] <mitsuhiko> because you would need a central counting server

[20:40:24] <mitsuhiko> multi_io: i recommend against object ids though. use uuids

[20:41:20] <multi_io> dstorrs: the point is, we need that unique integer anyway (it's required by the use case), so it's a "natural" key, not an artificial one

[20:41:43] <dstorrs> You asked, we answered. Your call.

[20:42:00] <multi_io> I was thinking of making the counter values globally unique using only information that's available locally on the shard

[20:42:24] <dstorrs> *blink*

[20:42:38] <multi_io> something like value = shardNr + shardCount*(local counter value)

[20:42:38] <dstorrs> that's a pretty good trick, if those are really going to be just numbers

[20:42:59] <multi_io> (if that's possible, I'm not sure yet)
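
A minimal sketch of the formula multi_io proposes, assuming shards are numbered from 1 to `shardCount` and each shard keeps its own counter starting at 0. `globalId` is a hypothetical name, and the scheme only stays collision-free while `shardCount` never changes.

```javascript
// value = shardNr + shardCount * (local counter value)
// Assumptions (not from the log): shardNr runs from 1 to shardCount,
// and each shard's local counter starts at 0.
function globalId(shardNr, shardCount, localCounter) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError('shardCount must be a positive integer');
  }
  if (!Number.isInteger(shardNr) || shardNr < 1 || shardNr > shardCount) {
    throw new RangeError('shardNr must be between 1 and shardCount');
  }
  if (!Number.isInteger(localCounter) || localCounter < 0) {
    throw new RangeError('localCounter must be a non-negative integer');
  }
  return shardNr + shardCount * localCounter;
}

for (let shard = 1; shard <= 3; shard++) {
  const ids = [0, 1, 2].map((n) => globalId(shard, 3, n));
  console.log(`shard ${shard}: ${ids.join(', ')}`);
}
// shard 1: 1, 4, 7
// shard 2: 2, 5, 8
// shard 3: 3, 6, 9
```

The values start at 1 and interleave across shards, so they are only roughly increasing: a busy shard can run well ahead of a quiet one.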

[20:44:03] <mitsuhiko> multi_io: so why not use an objectid then?

[20:44:07] <mitsuhiko> or a uuid

[20:44:11] <dstorrs> ooooooh! I just discovered covered indexes. Shiny!

[20:44:27] <mitsuhiko> >>> uuid.uuid4().int

[20:44:27] <mitsuhiko> 13340449400021276730959379030318756414L

[20:44:29] <mitsuhiko> there. integer

[20:44:44] <mitsuhiko> :)

[20:44:44] <Derick> a bit one too

[20:44:53] <Derick> big*

[20:46:01] <mitsuhiko> well, uuid's are large ;)

[20:46:06] <multi_io> the customer requires that the numbers be roughly incrementing values, starting from 1. They are supposed to be exposed to end users :P

[20:46:15] <mitsuhiko> multi_io: stupid requirement

[20:46:29] <mitsuhiko> multi_io: you can use uuids and expose them to customers as well

[20:46:43] <mitsuhiko> base62 encode them

[20:47:23] <multi_io> mitsuhiko: why do you recommend uuids over objectids?

[20:47:53] <mitsuhiko> multi_io: well supported format in general, entirely random if you use uuid4, perfect for sharding

[20:48:20] <multi_io> ok, so there is no special performance benefit to objectids?

[20:48:48] <multi_io> why does mongodb not use uuids to begin with?

[20:48:52] <mitsuhiko> multi_io: i don't like objectid's because a) specific to mongodb, b) 32bit timestamp, c) not entirely random

[20:49:02] <mitsuhiko> multi_io: legacy i suppose and because objectid's are orderable

[20:49:11] <mitsuhiko> but if i want ordering i would not use mongodb, i would use postgres

[20:49:30] <mitsuhiko> the only reason to use mongodb is if you want sharding. otherwise it's just a pain

[20:52:34] <mitsuhiko> if *anyone* has any ideas how to reliably run mongos, please let me know

[20:52:49] <mitsuhiko> even with the latest version these things are still incredibly unreliable

[20:53:31] <tilleps> keeps crashing?

[20:53:58] <multi_io> mitsuhiko: you would prefer Postgres over MongoDB for any use case as long as it doesn't involve sharding?

[20:54:13] <mitsuhiko> tilleps: don't even come up

[20:54:24] <mitsuhiko> tilleps: if they don't find the mongo config servers they become headless zombies

[20:54:37] <mitsuhiko> tilleps: they keep retrying but never manage to find them, if you restart them they can

[20:54:43] <mitsuhiko> multi_io: god yes

[20:55:31] <mitsuhiko> multi_io: mongodb is great for my particular usecase, don't get me wrong, but it's very unreliable still and it needs someone to sit next to keeping it running

[20:55:57] <mitsuhiko> and if you run it with less than three controller servers i just don't trust it yet

[20:56:15] <multi_io> that's kind of a minority opinion I guess :P I can't imagine that it would be so hard to implement app-specific sharding yourself in Postgres

[20:56:50] <mitsuhiko> multi_io: do you have to shard?

[20:57:03] <multi_io> mitsuhiko: no, not yet

[20:57:11] <mitsuhiko> will you ever have to shard

[20:57:22] <multi_io> I hope I will sometime next year :)

[20:58:05] <multi_io> i.e. I hope the system will become big and successful enough, haha

[20:58:16] <mitsuhiko> multi_io: i think the pain is just not worth it

[20:58:34] <mitsuhiko> multi_io: from non sharding to sharding in mongodb is very painful

[20:59:01] <mitsuhiko> at least from my humble experience

[20:59:13] <mitsuhiko> bbl

[20:59:42] <richardraseley> Just trying to clarify some points with regard to MongoDB sharding and replication. So the smallest component that will be sharded is the document. It is sharded according to the shard key, which is selected and must be unique to all documents. Multiple documents with contiguous shard keys will exist in a chunk, and chunks are divided across shards as needed? Is that correct?

[21:00:12] <multi_io> mitsuhiko: so you wouldn't use mongodb because it's too unreliable, not because it doesn't provide any more features than, say, Postgres.

[21:04:42] <tilleps> how to calculate how much space mongoconfig servers take?

[21:05:54] <multi_io> richardraseley: the shard key doesn't have to be unique, I don't think.

[21:06:04] <multi_io> wouldn't make much sense

[21:08:39] <richardraseley> multi_io: Hmm, I thought that was the case - but assuming you're right, can you speak to the rest of my assumptions?

[21:10:11] <richardraseley> multi_io: Also, can you comment on whether or not the mongo routing service has to talk to the mongo config service for every query, or if it maintains a copy of the shard configuration locally as well?

[21:13:01] <richardraseley> multi_io: Just confirming that you are correct with regard to shard key uniqueness (per http://docs.mongodb.org/manual/core/sharding/#sharding-shard-key)

[21:13:13] <jstout24> i'm trying to think of the best schema design for an events passing an object like, `db.events.insert({ name: 'impression', parameters: { page: 'some_page_id', layout: 'some_layout_id', ……., visitor: 'some_visitor_id' } });`

[21:13:26] <jstout24> i was thinking about turning the parameters into a key / value pair and do a multi-index on that

[21:13:33] <jstout24> but i'm not sure how to query upon given parameters

[21:14:47] <multi_io> richardraseley: I guess it keeps it in memory, but I don't know, sorry. I haven't done sharding in production :)

[21:16:38] <richardraseley> Can anyone else confirm my assumptions in the comment above? Documents being the most granular unit that can be sharded, being gathered in ranges in "chunks" and spread across shards as needed?

[21:16:44] <richardraseley> multi_io: that's ok - thanks.

[21:33:01] <jstout24> if i have a document with field "data" which has key / value pairs… how can i search all documents where `(data.k = 'foo_k' & data.v = 'foo_v') AND (data.k = 'bar_d' & data.v = 'bar_v')`
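
Nobody in the channel answered this one. One way such a query is commonly written combines `$and` with one `$elemMatch` per pair, so that `k` and `v` have to match inside the same array element. The sketch below only builds the query document; it assumes `data` is an array of `{ k, v }` sub-documents, and `kvQuery` is a hypothetical helper name.

```javascript
// Builds a query requiring every listed k/v pair to be present in the
// array field. Assumption (not from the log): the field holds an array
// of { k: ..., v: ... } sub-documents.
function kvQuery(field, pairs) {
  return {
    $and: Object.entries(pairs).map(([k, v]) => ({
      [field]: { $elemMatch: { k: k, v: v } },
    })),
  };
}

// Key names copied from the question as asked.
const query = kvQuery('data', { foo_k: 'foo_v', bar_d: 'bar_v' });
console.log(JSON.stringify(query, null, 2));

// In the mongo shell the result would be passed to find(), e.g.
//   db.events.find(query)
```

Without `$elemMatch`, a condition on `data.k` and another on `data.v` could be satisfied by two different array elements, which is not what the question asks for.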

[21:51:52] <infinitiguy> hello

[21:51:55] <infinitiguy> anyone use rest?

[21:52:07] <infinitiguy> I added rest = true in my mongod.conf but port 28017 didn't come up

[21:52:15] <infinitiguy> I also restarted mongo

[21:59:48] <mitsuhiko> multi_io: mongodb without replication is unreliable. mongodb with replication is a very complex setup compared to postgres

[22:00:15] <mitsuhiko> on top of that you don't have transaction safety or the expressiveness of sql

[22:01:24] <mitsuhiko> richardraseley: yes, that's how it works

[22:01:47] <richardraseley> mitsuhiko: Thank you for your response.

[22:02:23] <mitsuhiko> richardraseley: "Also, can you comment on whether or not the mongo routing service has to talk to the mongo config service for every query" <- talks to the config service yes

[22:02:27] <mitsuhiko> which is why you want three of them

[22:02:41] <richardraseley> So it talks to it with every query?

[22:03:07] <richardraseley> Client Request -> Router -> Config to determine placement -> to appropriate shard -> back to client?

[22:03:24] <mitsuhiko> richardraseley: yes

[22:03:40] <richardraseley> Also, with regard to write scalability: is it safe to say that write performance on a single document is limited to the maximum performance of the master node in the shard that owns the chunk in which it lives?

[22:04:25] <mitsuhiko> richardraseley: actually, it does not need to talk to the config all the time

[22:04:35] <mitsuhiko> mongos will cache it for some time

[22:04:46] <mitsuhiko> for as long as the shard setup does not change i think

[22:05:10] <richardraseley> mitsuhiko: OK, thank you.

[22:05:18] <mitsuhiko> see also: http://docs.mongodb.org/manual/faq/sharding/#when-do-the-mongos-servers-detect-config-server-changes

[22:05:50] <richardraseley> Ah, thank you.

[22:06:06] <mitsuhiko> richardraseley: generally it's quite worry free when it's running

[22:06:23] <mitsuhiko> the getting it running part is hard :)

[23:29:02] <mkmkmk> i have a 2.0.6 mongos that keeps dying on startup, "received signal 11"

[23:29:07] <mkmkmk> with a backtrace

[23:54:39] <dstorrs> ok, this has to be an ID10T error on my part but I don't see it. count_pages = function(v) { print(v.pages.length) }; count_pages.apply({pages : [] }) => 'TypeError: v has no properties (shell):1'

[23:54:43] <dstorrs> what am I missing?

[23:56:36] <mitsuhiko> dstorrs: first argument to apply is an object pointer

[23:56:41] <mitsuhiko> (which is bound to this)

[23:56:54] <mitsuhiko> i think you want count_pages.call(null, {pages: []})

[23:57:34] <dstorrs> right. Yes, thank you. That was exactly it.
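
A short sketch of the fix mitsuhiko describes. The function returns the length instead of printing it so the result can be inspected; that change is an assumption made here for illustration.

```javascript
// Same function as in the log, but returning the length instead of printing it.
const count_pages = function (v) {
  return v.pages.length;
};

// call: first argument is bound to `this`, the rest are passed individually.
console.log(count_pages.call(null, { pages: [] }));    // 0

// apply: first argument is bound to `this`, the second is an ARRAY of arguments.
console.log(count_pages.apply(null, [{ pages: [] }])); // 0

// The original mistake: the object becomes `this`, so v is undefined.
try {
  count_pages.apply({ pages: [] });
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```

Plain `count_pages({ pages: [] })` works as well when `this` does not matter.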

[23:58:02] <mitsuhiko> mkmkmk: got a logfile?

[23:58:34] <mkmkmk> https://jira.mongodb.org/browse/SERVER-6110

#mongodb logs for Friday the 15th of June, 2012