#mongodb logs for Thursday the 13th of March, 2014

[00:02:28] <joshua> unholycrab: I think you need to have them separated, because if one goes down and you have multiple config servers there, they all go down and it defeats the purpose. It might work ok sharing the resources with a mongod node, but you can also use a really lightweight machine. Config servers don't need as many resources
[00:11:11] <valonia-> So I have two columns, one named level and one named maxlevel
[00:11:27] <valonia-> If I want to remove the ones where level == maxlevel
[00:11:32] <valonia-> How do i do that?
[00:11:37] <valonia-> In PHP
[00:13:46] <valonia-> In MySQL it would just be like: DELETE FROM accounts WHERE level = maxlevel
[00:18:49] <ruphos_> valonia-: http://docs.mongodb.org/manual/reference/method/db.collection.remove/
[00:20:02] <valonia-> ruphos: that isn't very helpful :(
[00:20:11] <joshua> http://docs.mongodb.org/manual/reference/operator/query/where/
[00:20:50] <ruphos> it is precisely the mongo equivalent to delete
[00:21:09] <valonia-> joshua: so you are suggesting that I would use a find and then remove the ones I found?
[00:21:47] <joshua> Never done it with PHP, but here's the remove manual for that http://www.php.net/manual/en/mongocollection.remove.php
[00:22:41] <joshua> I don't know how the where operator works in that case, you might be able to do the two in one command
[00:24:34] <joshua> remove syntax says the first option is query, so on the mongodb shell at least you should be able to pass the same query to remove as you do to find
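A rough shell equivalent of the MySQL statement quoted above; the same query document can be passed to the PHP driver's MongoCollection::remove(). Note that $where evaluates JavaScript against every document and is slow on large collections, so treat this as a sketch rather than the recommended approach.

```javascript
// Delete every account whose level has reached its maxlevel.
db.accounts.remove({ $where: "this.level == this.maxlevel" });
```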
[00:36:44] <Zitter> Hi, I've found this collection http://media.mongodb.org/zips.json. I'm new to mongo, is there a tutorial focused on that data?
[00:37:49] <joshua> Zitter: http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/
[00:39:52] <Zitter> thanks joshua
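For reference, the documents in that data set look roughly like { "_id": "10280", "city": "NEW YORK", "state": "NY", "pop": 5574 }, and the first example in the linked tutorial (quoted from memory, so treat the details as an approximation) groups them by state:

```javascript
// States with a total population of at least 10 million.
db.zipcodes.aggregate([
    { $group: { _id: "$state", totalPop: { $sum: "$pop" } } },
    { $match: { totalPop: { $gte: 10 * 1000 * 1000 } } }
]);
```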
[02:59:30] <rafaelhbarros> 17min to run a map reduce
[02:59:37] <rafaelhbarros> how can I fix this?
[02:59:52] <rafaelhbarros> the data structure is this
[03:00:39] <rafaelhbarros> {'object_number': "hex_digest", 'events':['event1', 'event2']}
[03:00:45] <rafaelhbarros> I'm doing emit on each event
[03:00:56] <rafaelhbarros> and the reduce just sums 1 for each key
[03:02:34] <joannac> what's the ket?
[03:15:41] <rafaelhbarros> joannac: KET?
[03:18:01] <joannac> key*
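A rough reconstruction of the job rafaelhbarros describes (the collection name "objects" is an assumption): the map emits a 1 per event, and the reduce sums the values. For a plain count like this, an aggregation with $unwind and $group is usually much faster than map/reduce.

```javascript
var map = function () {
    this.events.forEach(function (event) {
        emit(event, 1);            // one emit per event in the array
    });
};
var reduce = function (key, values) {
    return Array.sum(values);      // total occurrences of each event
};
db.objects.mapReduce(map, reduce, { out: { inline: 1 } });

// Usually much faster for a simple count:
db.objects.aggregate([
    { $unwind: "$events" },
    { $group: { _id: "$events", count: { $sum: 1 } } }
]);
```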
[04:43:29] <fluter> hi,
[04:43:40] <fluter> what's the difference between pymongo and python-pymongo,
[04:43:45] <fluter> for python driver use?
[04:43:56] <fluter> I see both packages exist in the repo
[04:44:46] <fluter> anyone?
[05:09:25] <joannac> fluter: which repo?
[05:09:44] <fluter> joannac, I'm running centos, and the packages are in epel repo
[05:12:11] <joannac> I'd say python-pymongo since it's a higher version?
[05:12:49] <fluter> joannac, that's my guess too, but they both exist, which means one can install both
[05:13:01] <fluter> they should be different packages?
[05:14:04] <joannac> they shouldn't, it probably just got a name change
[05:15:11] <fluter> ok,
[05:15:16] <fluter> that's better then
[05:15:30] <fluter> so I should use python-pymongo, as it's the newer version, right?
[06:44:06] <Rahar> Hi everybody! I need advice on db choice/design. Say I have millions of user ids and billions of post ids. I need to store whether a user has seen a post or not and later query that data. The query is something like "has this user seen this post?". The updates are done very frequently (each time a user reads a post). Any suggestions on whether to use mongo in this case or to go with another type of db?
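One way to model this in MongoDB, sketched with assumed names (a collection "seen" holding one small document per user/post pair; userId and postId are hypothetical variables). The unique index doubles as the lookup index for the "has this user seen this post?" query.

```javascript
db.seen.ensureIndex({ user: 1, post: 1 }, { unique: true });

// Record a view; the upsert makes repeated reads of the same post idempotent.
db.seen.update({ user: userId, post: postId },
               { $set: { at: new Date() } },
               { upsert: true });

// Has this user seen this post?
db.seen.findOne({ user: userId, post: postId }) !== null;
```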
[07:26:09] <Gr1> Hi everyone
[07:26:21] <Gr1> I have 3 boxes for sharding
[07:26:29] <Gr1> and I enabled sharding at the db level
[07:26:40] <Gr1> { "_id" : "ekl_omniscient", "partitioned" : true, "primary" : "shard0000" }
[07:26:52] <Gr1> But when I try to enable sharding at the collection level,
[07:27:03] <Gr1> sh.shardCollection("ekl_omniscient.serviceability", { "_id": "hashed" } )
[07:27:19] <Gr1> I am getting the following error. "errmsg" : "exception: nextSafe(): { $err: \"not master and slaveOk=false\", code: 13435 }"
[07:27:34] <Gr1> I am doing it from mongos
[07:36:21] <txt23> I have a CSV file like this http://pastebin.com/4Uy3wgK8 which has 1st row has column names and rest is data. How can I import it into mongodb via mongoimport? Can someone please guide me?
[07:37:02] <iwantoski> if my documents define a type as a string, i.e. { type: 'Answer' }, { type: 'Question' }, { type: 'Other' }, can I order by the type field like so: Question, Answer, Other? Basically, specify what type (string) goes in what priority?
[07:37:10] <txt23> I know technically it will be --type csv --file /opt/backups/contacts.csv but how will column name work?
[07:37:43] <iwantoski> txt23: Did you try it?
[07:37:52] <txt23> iwantoski: Not yet
[07:38:02] <iwantoski> txt23: Try it and see :)
[07:38:11] <txt23> iwantoski: Okay. Lets see
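The flag txt23 was missing is --headerline, which tells mongoimport to take the field names from the first CSV row. A hedged example, assuming the target database is "test" and the collection "contacts":

```
mongoimport --db test --collection contacts --type csv --headerline --file /opt/backups/contacts.csv
```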
[07:39:56] <ElephantHunter> Derick joannac Number6 - The interactive tutorial on the site cannot be completed, due to the server returning errors when modifying collections. There's no readily available site admin contact information... so I'm hoping one of you would be able to pass this on to whoever manages it.
[07:42:15] <txt23> iwantoski: Worked!
[07:42:20] <txt23> iwantoski: Thanks!
[07:53:16] <joannac> ElephantHunter: try.mongodb.com ?
[07:53:53] <ElephantHunter> jeannac yes
[07:54:00] <ElephantHunter> *joannac
[07:54:50] <joannac> ElephantHunter: thanks, I'll file a ticket
[07:55:23] <ElephantHunter> joannac oh wait, no... try.mongodb.org
[07:55:39] <joannac> ElephantHunter: yeah it's okay, I found it :)
[07:57:21] <Gr1> Any help guys?
[07:58:40] <Froobly> i have lists of items. i'm using mongoose. when i save an item it contains the list id, but the list does not update to contain the item id. is there a way of doing this in mongoose?
[08:00:32] <joannac> Gr1: you have a primary for every shard?
[08:02:34] <daniel-s_> Hi
[08:02:45] <daniel-s_> I'm using pymongo. What is the best way to store the time of something.
[08:03:13] <daniel-s_> Should I just use a field named "time" or something similar, then save the time as a floating point number?
[08:03:22] <daniel-s_> i.e., the output of time.time() ?
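For what it's worth, BSON has a native date type and pymongo maps datetime.datetime objects to it automatically, so storing the float from time.time() is usually unnecessary and loses easy range queries and sorting. A shell-level sketch with an assumed "events" collection:

```javascript
db.events.insert({ name: "login", time: new Date() });
db.events.find({ time: { $gte: ISODate("2014-03-01T00:00:00Z") } }).sort({ time: 1 });
```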
[08:07:58] <Froobly> if it's not possible, what would be an elegant solution?
[08:50:57] <jacksmith> I have to store 4 GB of key-value data and I want faster access to it. I have 1.6 GB of RAM on my server; will it be beneficial if I migrate to MongoDB?
[08:55:03] <Froobly> i'm using mongoose. i have lists of items. when i save an item it contains the list id, but the list does not update to contain the item id. is there a way of doing this in mongoose? if not, what is an elegant solution?
[09:01:37] <Nodex> jacksmith : if your indexes fit in RAM yes
[09:02:41] <jacksmith> Nodex: I tried using redis-server but I could not afford that much RAM
[09:12:30] <kamol> how to limit amount of result in runCommand ?
[09:13:04] <ron> kamol: if you don't mind me asking, what's your native language?
[09:13:27] <kamol> :ron sorry, not english
[09:13:43] <ron> kamol: yes, i know. it's okay, I was just curious.
[09:13:58] <kamol> ron: Uzbek
[09:13:59] <ron> you're in the most expensive place in the world!
[09:14:11] <ron> oh.. your ip is in singapore :)
[09:14:26] <kamol> :ron how did you find my IP?
[09:14:57] <ron> kamol: it's in your whois info
[09:15:52] <kamol> WHOIS ron ?
[09:16:01] <ron> yeah
[09:16:24] <kamol> it doesn't show your IP, strange
[09:17:21] <ron> yes, because I hide it.
[09:20:41] <kamol> :ron I will try to improve my question...
[09:21:31] <kamol> I am using db.runCommand(); however, I don't know how to limit the number of rows in the returned result.
[09:23:47] <kamol> sorry, ignore my question, I got the answer from the doc
[09:30:10] <alexgr> any morphia gurus here?
[09:30:43] <ron> alexgr: cheeser is the best morphia guru there is
[09:30:47] <ron> buthe's probably asleep
[09:30:51] <ron> but he's*
[09:30:54] <ron> try asking though
[09:30:59] <alexgr> cheeser ping
[09:31:09] <alexgr> can i use @Reference List<String> usernames where User entity has Id and userName as index? I want to have the username readily available without hitting the db
[09:31:12] <ron> dude, really, he's asleep :)
[09:32:13] <alexgr> i want to make a document that stores the usernames of the users (which are unique and indexed) but i don't need their id and i don't want to fetch them while displaying the referencing document , only their usernames
[09:32:17] <alexgr> a bit denormalized
[09:52:52] <Gr1> joannac: Sorry mate, I was away. You there?
[09:53:55] <Gr1> I have only one primary and 2 secondary
[10:51:43] <BaNzounet> Hey guys, If I do emails.email will it check all elements of the array or will it only check the first one?
[10:52:16] <Derick> in a query? all
[10:53:07] <BaNzounet> yep with .find(), thanks Derick :)
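A minimal illustration of Derick's point, with assumed names: a dotted path into an array of subdocuments matches if any element matches, not just the first.

```javascript
db.users.insert({ name: "a", emails: [ { email: "one@example.com" }, { email: "two@example.com" } ] });
db.users.find({ "emails.email": "two@example.com" });   // matches, even though it is the second element
```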
[10:53:56] <romaric_> Hi all, we have an issue with the php driver after we've sharded our mongodb database. Previously, we were connecting to mongo like this: $mongo = new Mongo($dsn, $params); $db = $mongo->selectDb($dbName); And we were getting the connection id like this: $lastError = $db->getLastError(); $conId = $lastError['connectionId']; Once the db has been sharded there is no more connectionId. Does anyone know what the good
[10:53:56] <romaric_> practice is to get the connection id?
[10:54:30] <Derick> hm
[10:54:36] <Derick> romaric_: please use "MongoClient"
[10:54:40] <Derick> instead of "Mongo"
[10:55:00] <Derick> but getLastError returns the result right from the server (mongos)
[10:55:01] <romaric_> yep ok, but i was looking at the doc and i didn't see anywhere a way to get the connection id
[10:55:09] <Derick> so not sure whether you can get to the connection_id anymore
[10:55:13] <Derick> why do you need it?
[10:55:14] <romaric_> arf
[10:55:19] <romaric_> actually
[10:55:31] <romaric_> it is to get the status of a running mapreduce
[10:55:50] <Derick> but M/R will likely run on more than one server
[10:56:00] <romaric_> each client has its own connection id
[10:56:04] <romaric_> indeed...
[10:56:53] <romaric_> but when I do a db.currentOp(opid) I get a progress object with done & total
[10:57:05] <romaric_> I can't get this anymore when the collection is sharded?
[10:57:30] <Derick> did you try running that against mongos?
[10:58:05] <romaric_> no, I'll do
[10:58:25] <romaric_> do you know a way to get a status of a running MR ?
[10:58:54] <Derick> no, sorry :-/
[10:58:57] <romaric_> oki
[10:59:00] <Derick> I haven't used M/R in a long while
[10:59:14] <romaric_> thanks for help, i'll try to keep you up to date
[10:59:34] <romaric_> do you use aggregation fwk only ?
[10:59:39] <Derick> yes
[10:59:49] <Derick> I'll be boarding a plane soon though.
[11:00:25] <romaric_> humm
[11:00:39] <romaric_> thank you for the help
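A hedged way to check map/reduce progress from the mongos shell without relying on the connection id: scan db.currentOp() for operations that report a progress counter. The msg/progress fields are what long-running operations exposed on servers of this era, so treat the exact shape as an assumption.

```javascript
db.currentOp().inprog.forEach(function (op) {
    if (op.msg && op.progress) {
        // e.g. "m/r: (1/3) emit phase 1234/100000" and { done: 1234, total: 100000 }
        print(op.opid + "  " + op.msg + "  " + tojson(op.progress));
    }
});
```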
[11:12:08] <gingerninja> Hi
[11:14:00] <gingerninja> Do queries execute remotely?
[11:14:23] <Nodex> eh?
[11:14:30] <gingerninja> On the db instance or local machine?
[11:14:41] <gingerninja> (Client)
[11:14:42] <Nodex> serious?
[11:14:52] <gingerninja> Yes
[11:18:39] <gingerninja> Helpful.
[11:20:28] <Nodex> you're asking a stupid question. How on earth do you expect a client to perform a query with no data?
[11:23:10] <gingerninja> Nodex someone told me that the whole collection is transferred to the client and then the query for a specific document is searched for on the client side
[11:23:25] <gingerninja> Smells fishy to me
[11:23:51] <Nodex> whoever told you that doesn't have a clue about MongoDB
[11:24:08] <gingerninja> Hehe good.
[11:28:01] <gingerninja> An architect lol
[11:30:46] <gingerninja> Nodex thanks gtg
[11:43:27] <kali> oO
[12:21:50] <iwantoski> Is it possible to specify a projection of a subdocument?
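For the record (the question went unanswered here), projecting part of a subdocument works with dot notation. A sketch with an assumed document shape:

```javascript
// Documents like { name: ..., address: { city: ..., zip: ... } }
db.people.find({}, { name: 1, "address.city": 1 });   // returns _id, name and address.city only
```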
[12:33:10] <groundup> I have a collection, named likes, that holds all of the likes that relate to several other things - places, events, and guides. Each document is something like this: {place: 'id', user: 'id'} or {guide: 'id', user: 'id'}. There is always a user but the other field can be non-existent.
[12:34:14] <groundup> So, I added a sparse, unique index for each type. ensureIndex({place:1, user:1}, {unique:1, sparse:1}) ensureIndex({guide:1, user:1}, {unique:1, sparse:1})
[12:34:44] <groundup> I am getting duplicate key errors when I add a new like for a different place. The guide index is the one that's complaining.
[12:44:17] <kali> i'm not sure sparse will work as expected with multiple key index
[12:45:24] <kali> i think the "sparse" logic says "if the key is null, don't index". but with a hybrid key, the key itself is not null, it's one of its components which is null
[12:46:25] <kali> groundup: that's just speculation, but it could explain what you're experiencing :)
[12:50:09] <Gr1> Hi guys
[12:50:54] <groundup> kali, http://docs.mongodb.org/manual/core/index-sparse/ "You can specify a sparse and unique index, that rejects documents that have duplicate values for a field, but allows multiple documents that omit that key."
[12:51:12] <Gr1> I have a sharded cluster of 3 machines. Now, if I need to make it a replica set, do I need extra machines, or is it Ok to use these 3 machines itself as replica set?
[12:51:50] <Gr1> And if I make it to a replica, there can be only one primary right?
[12:52:59] <groundup> hmmm... https://jira.mongodb.org/browse/SERVER-2193
[12:55:56] <groundup> Ugh, this just made my life very complicated.
[12:59:05] <kAworu> hi, is this expected: https://gist.github.com/kAworu/9527549 ?
[12:59:39] <kAworu> basically I want a unique index on a subdocument's property, but when I use the $push operator I can create a duplicate.
[13:02:47] <kali> groundup: one way would be to add two composite fields yourself: place_and_user_if_both_not_null, guide_and_user_if_both_not_null and have unique sparse index on them
[13:02:57] <kAworu> I found nothing on the Indexes FAQ
[13:03:58] <kali> kAworu: unique index must be understood as "at most one document in the collection has this value"
[13:04:36] <kAworu> kali: thank you, I just found this link http://grokbase.com/t/gg/mongodb-user/1252d9dg96/push-with-unique-indexes in which the answer concurs with you.
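The usual workarounds for kAworu's situation, sketched with made-up names: a unique index only compares values across documents, so keeping an array inside one document duplicate-free is the update's job.

```javascript
// Option 1: $addToSet skips elements that are already present (whole-element equality).
db.coll.update({ _id: id }, { $addToSet: { items: { sku: "abc" } } });

// Option 2: only $push when no element with that sku exists yet in this document.
db.coll.update({ _id: id, "items.sku": { $ne: "abc" } },
               { $push: { items: { sku: "abc", qty: 2 } } });
```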
[13:05:50] <groundup> I think I am going to change it to something like this: user, type, id, value. Then type would be one of [place, event, guide, review]
[13:06:36] <groundup> Then remove the sparse index. Add the unique index as {user, type, id}.
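A sketch of the restructured collection groundup describes, which sidesteps the sparse compound-index problem by making the like target generic (field names follow the messages above; userId and placeId are hypothetical):

```javascript
// One document per like: { user: <userId>, type: "place"|"event"|"guide"|"review", id: <targetId> }
db.likes.ensureIndex({ user: 1, type: 1, id: 1 }, { unique: true });

db.likes.insert({ user: userId, type: "place", id: placeId });   // first like succeeds
db.likes.insert({ user: userId, type: "place", id: placeId });   // second like: duplicate key error, as intended
```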
[13:13:45] <cheeser> alexgr: pong?
[13:38:52] <coreyfinley> Has anyone experienced an issue where adding a query to a map/reduce seems to process more data than running the same map/reduce against the entire collection?
[13:40:04] <cheeser> um. how could that happen?
[14:02:39] <starfly> Gr1: you can use the 3 systems, assuming their performance capacity is adequate. Yes, one primary exists in a replica set, so with 3 shards at present, there will be 3 primaries inside 3 replica sets.
[14:24:03] <Gr1> Thanks starfly
[14:24:19] <Gr1> One more. So once shard and replica is enabled,
[14:24:32] <Gr1> mongos would be able to identify which is master and which is slave right?
[14:25:10] <starfly> correct, Gr1 (which are the primaries and which are the secondaries--more accurate terminology).
[14:26:42] <starfly> Gr1: one replica set is needed for each of your shards
[14:27:05] <Gr1> I see.. Thank you starfly :)
[14:27:11] <starfly> np :)
[14:32:46] <ekristen> what is the recommended way to backup mongodb, when you are dealing with 100s of GB of data?
[14:33:01] <ekristen> right now I’m at like 180gb and with mongodump it takes 3 hours
[14:34:19] <Joeskyyy> Point in time snapshots can work.
[14:34:38] <Joeskyyy> I typically set all my stuff in an LVM and take an LVM snapshot
[14:34:57] <Joeskyyy> That way if I need to restore anything from a backup, I have the actual data files to restore from
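One detail worth noting with snapshot backups: if journaling is disabled, or the journal does not live on the same volume as the data files, writes should be flushed and blocked while the snapshot is taken. A minimal sketch:

```javascript
db.fsyncLock();    // flush to disk and block writes
// take the LVM snapshot from another shell, e.g. with lvcreate --snapshot ...
db.fsyncUnlock();  // resume writes
```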
[14:35:10] <KamZou> Hi, on this command : db.runCommand( { shardcollection : "stats.mycollection", key : {"_id":1} }) could you please explain to me what do : "id":1 ?
[14:36:46] <Joeskyyy> That specifies "id" as your shardkey
[14:36:54] <Joeskyyy> err, rather _id
[14:40:19] <KamZou> Joeskyyy, so it specifies the whole _id
[14:40:41] <Joeskyyy> KamZou: No, that is telling mongo what to shard your collection on
[14:40:45] <Joeskyyy> I'd recommend reading this: http://docs.mongodb.org/manual/core/sharding-shard-key/#shard-key
[14:41:05] <KamZou> Mmm ok
[14:41:42] <KamZou> The problem is : i've sharded the full _id collection (with 5-6 elements inside)
[14:42:11] <KamZou> i wanted only one ... Is there a way to modify that after the sharding began ?
[14:42:26] <Joeskyyy> I don't get what you're trying to say...?
[14:42:36] <KamZou> hmmm let me rephrase
[14:43:13] <KamZou> inside my collection : "mycollection" i've a field "_id"
[14:43:50] <KamZou> and in _id I've elements like d (date), c (other thing), z (other thing)
[14:43:53] <KamZou> do you follow me ?
[14:44:04] <Joeskyyy> So far yes.
[14:44:08] <KamZou> with the command i type : db.runCommand( { shardcollection : "stats.mycollection", key : {"_id":1} })
[14:44:18] <KamZou> i put the sharding on every "elements" of _id
[14:44:29] <KamZou> not only on _id.d for instance
[14:44:34] <KamZou> it's a mistake
[14:44:44] <KamZou> is there a way to fix this now ?
[14:44:48] <Joeskyyy> You'd need to reshard your collection
[14:44:53] <Joeskyyy> But sharding on a date is a bad idea
[14:44:55] <Joeskyyy> badddd bad idea
[14:45:00] <KamZou> why ?
[14:45:09] <Joeskyyy> Read the link I sent for more clarity, but your data won't balance
[14:45:29] <Joeskyyy> You'll have older dates which contain WAY more documents in their chunk than newer dates
[14:45:32] <KamZou> Joeskyyy, in that specific case we don't need to balance
[14:45:35] <Joeskyyy> so your cardinality will be terrible
[14:45:39] <Joeskyyy> Then why are you sharding?
[14:45:57] <Joeskyyy> If you're not balancing, sharding is pretty much silly
[14:46:14] <Joeskyyy> the point of sharding is to have relatively even data chunks to query to
[14:47:12] <KamZou> Joeskyyy, cause we've not any disk space soon (the first shard is SSD for recent data, the second one is HDD for older chunks)
[14:47:31] <KamZou> *we'll run out of space soon
[14:49:16] <Joeskyyy> Gotcha, well typically you can do a dump on that collection, drop it, then reimport it
[14:49:21] <Joeskyyy> If you're going to reshard it
[14:49:47] <Joeskyyy> I still don't think sharding is the best solution you're looking for… but it's your database
[14:50:22] <KamZou> Joeskyyy, i'm open to suggestions if you have some for this particular case :)
[14:51:44] <Joeskyyy> Do you actually access the older data?
[14:52:06] <Joeskyyy> Or is it possible to archive it?
[14:52:23] <starfly> it depends on the profile of writes and reads to old and new data
[14:53:12] <KamZou> Joeskyyy, yes our customers access to older data
[14:57:06] <step3profit> Can anyone point me at an example of consuming the oplog in python? I am using pymongo to connect. I have tried using a tailable cursor, and not using one... in any case, it seems to pause for about 20-40 seconds, then return 100-300 records, then pause again.
[14:57:27] <step3profit> I'm not using any timestamp or anything in the find at this time
[14:57:29] <Joeskyyy> It really depends on the access patterns KamZou.
[15:00:29] <KamZou> Joeskyyy, ok. Is this a big issue to set a shard key on "full" _id instead of _id.d in my case ?
[15:01:37] <Froobly> Is it possible to use an update to add a value to an array? or am i going about this the wrong way?
[15:02:46] <Joeskyyy> KamZou: That should sparse it out pretty neatly
[15:03:02] <Joeskyyy> But functionality wise it won't make sense if you're not querying on _id
[15:03:03] <Nodex> Froobly : check out $push and or $addToSet
[15:03:14] <Froobly> thanks Nodex
[15:03:27] <Nodex> $addToSet will add it to an array if it doesn't exist
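Applied to Froobly's case, a hedged sketch (collection and field names are assumptions): when the item is saved, push its id into the parent list document's array; $addToSet keeps this idempotent if the save is retried.

```javascript
// Add itemId to the list's "items" array, but only once.
db.lists.update({ _id: listId }, { $addToSet: { items: itemId } });
```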
[15:06:50] <barry1> Hi there, would anyone be able to help me with a doctrine2 mongodb query?
[15:07:21] <Nodex> we can help with the query as it would be on the cmd line
[15:09:30] <barry1> ah right. So not about polymorphic embedded documents in doctrine2?
[15:11:42] <Nodex> if you choose to use some ORM crap that's your choice, we can help with the structure as it would be in a json style builder
[15:12:30] <ekristen> Joeskyyy: so you just take LVM snapshots and call it a day? do you back up those off to another server?
[15:13:24] <barry1> yes I dont need help with that, the mongodb documentation is pretty good. thanks.
[15:14:15] <Nodex> lol
[15:14:24] <Joeskyyy> ekristen: Yeah, I push them up to some backup up location. Then i can always mount them and restore
[15:14:24] <Nodex> doctrine fanboy
[15:14:43] <Joeskyyy> Nodex: He so mad bro
[15:16:42] <Nodex> yup, what I don't understand is all the queries end up as something you must be able to do on the shell so what's the problem with working in a format that most people know
[15:17:03] <Nodex> the polymorphic part is given by the language and not by the ORM anyway
[15:27:56] <coreyfinley> When running my map reduce with a query, it aggregates 32 records with the same key, however when I run the same map reduce on the full collection, w/o a query, it only finds 4 records for the same key.
[15:32:37] <alexgr> i want to make a document that stores the usernames of the users (which are unique and indexed) but i don't need their id and i don't want to fetch them while displaying the referencing document , only their usernames
[15:32:39] <alexgr> a bit denormalized
[15:33:21] <alexgr> @reference List<string> with only the usernames (indexed) but without the id or part of the object (username,id) is possible with morphia?
[15:35:03] <cheeser> it isn't. morphia will pull the entire referenced object.
[15:36:29] <alexgr> but @reference only keeps refs to the id's according to documentation no?
[15:37:54] <cheeser> if you use @Reference morphia will store a DBRef (by default) in mongodb
[15:38:25] <cheeser> but when loading the containing object, that reference will be used to load the other object and that's done in its entirety
[15:39:34] <alexgr> so i can only have this functionality if i make the userName the default key for the collection... is it bad practice if i don't have an objectid but a unique string that i set?
[15:40:19] <cheeser> the id can be whatever you want it to be. ObjectId is just the default the database uses.
[15:40:45] <cheeser> but even using a string as the id, you can't do what you want. morphia will use that string to load the entire object
[15:41:13] <alexgr> can't i just print this array without fetching the objects?
[15:41:25] <cheeser> the list of references? no.
[15:41:37] <alexgr> i want when i fetch the document to just have an array {smith,alister,john} and display it
[15:41:38] <cheeser> that's not what @Reference is for
[15:41:44] <alexgr> without querying anything
[15:42:13] <cheeser> you'd have to store those values directly on an object. but those aren't @References as such
[15:42:59] <alexgr> hm i see maybe it's better to have a normal array and just do crud manually there
[15:43:53] <cheeser> in this case for sure.
[15:44:08] <alexgr> thanks
[15:44:50] <cheeser> np
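What alexgr settles on, shown at the shell level with assumed names: a plain embedded array of usernames maintained by hand, instead of a Morphia @Reference that would load the full User objects.

```javascript
// The referencing document simply embeds the usernames it needs to display:
// { _id: ..., title: "...", usernames: ["smith", "alister", "john"] }

db.groups.update({ _id: groupId }, { $addToSet: { usernames: "smith" } });   // add a member
db.groups.update({ _id: groupId }, { $pull: { usernames: "smith" } });       // remove a member
db.groups.find({}, { usernames: 1 });   // display without touching the users collection
```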
[15:45:34] <trupheenix> anyone know why I cannot run my PYMONGO script on crontab?
[15:45:59] <cheeser> probably no actual environment set up for it.
[15:47:40] <trupheenix> cheeser, ok so how would i set up the env for it?
[15:47:58] <cheeser> i'm not a pymongerer
[15:48:21] <trupheenix> cheeser, hmm wonder if someone can help me on this
[15:48:34] <anildigi_> https://news.ycombinator.com/item?id=7391366 — Midas – On-the-fly schema migration tool for MongoDB Released
[15:50:19] <anildigi_> Do checkout the tool
[15:50:41] <anildigi_> @Derick @joannac @Number6 ^
[16:04:12] <KamZou> Joeskyyy, about the discussion we had a few minutes ago
[16:04:18] <KamZou> You told me: <Joeskyyy> Gotcha, well typically you can do a dump on that collection, drop it, then reimport it
[16:05:09] <KamZou> Since I've hardly any data on the second shard right now, could I stop insertions, stop sharding and reshard the collections with the correct key?
[16:05:21] <KamZou> instead of dropping and recreating ?
[16:06:04] <Joeskyyy> No, because it's already been sharded. Mongo will complain
[16:06:19] <Joeskyyy> … I think. Nodex?
[16:19:04] <ekristen> if I am adding a new member to a replicaset and I have a full backup using mongodump, whats the right way to restore it into mongo so that the initial sync is small?
[16:19:39] <Joeskyyy> mongoimport it before adding it to the repl set
[16:19:47] <Joeskyyy> then add it to the replset
[16:20:07] <kali> from a mongodump ? i don't think that will work
[16:20:59] <Joeskyyy> If it's just a mongodump of the db, no?
[16:21:29] <kali> nope, you can't seed a replica from a mongodump
[16:21:36] <kali> you need a snapshot of the disk
[16:21:53] <Joeskyyy> huh.. interesting
[16:21:59] <kali> because there is no information in the mongodump about "when" the snapshot was done
[16:23:01] <Joeskyyy> what about if you use the --oplog flag in the mongodump?
[16:24:22] <ekristen> ok, I’ll snapshot the disk, this’ll be fun :/
[16:24:52] <ekristen> I wonder if it’d be easier to just let mongo sync 160gb at this point
[16:25:55] <kali> Joeskyyy: i did not know that exists, but even so... http://docs.mongodb.org/manual/tutorial/resync-replica-set-member/#copy-the-data-files
[16:26:27] <Joeskyyy> yeah just reading that as well
[16:26:35] <kali> ekristen: if your system is not overloaded, just letting the automatic sync works damn well
[16:26:42] <Joeskyyy> huh… interesting.
[16:26:45] <ekristen> kali: I just wonder how long it’ll take
[16:26:50] <Joeskyyy> +1 to that though
[16:26:51] <ekristen> or if I should just rsync the files
[16:27:58] <kali> ekristen: honestly, just try the automatic way... it's so easy... if you're getting into trouble, stop the syncing node and try something else
[16:28:06] <ekristen> kk
[16:28:52] <kali> you're aware rsync would require stopping the node you're rsyncing from ?
[16:29:14] <ekristen> wasn’t positive, but had a feeling
[16:30:18] <kali> good. just try the easy way :)
[16:32:35] <Joeskyyy> kali: Thanks for pointing that out though haha. I've always just done it the easy way too and never done an attempt to put preliminary data first
[16:32:36] <Joeskyyy> :D
[16:34:25] <receptor> is mongodb date object vulnerable to year 2038 bug?
[16:34:36] <kali> receptor: nope
[16:34:49] <starfly> planning ahead?
[16:34:57] <receptor> yeah
[16:35:23] <starfly> LOL, I thought Y2K planning a year out was considered almost overkill
[16:35:42] <cheeser> um.
[16:35:57] <cheeser> that would've been foolishly late to the party.
[16:37:09] <starfly> Yeah, but after everyone in the tech company avoided most issues, most people were like 'huh, what was the fuss, no problems resulted"
[16:37:45] <starfly> community
[16:38:04] <kali> yeah
[16:38:16] <Joeskyyy> i had my water bottles and cheetos
[16:38:24] <kali> aha :)
[16:40:53] <draggie> I have a dev environment with mongodb installed on it. My code checks to see if a particular document in a particular collection exists. If it does, then it returns it. Otherwise, it inserts the document into the database. I drop the collection containing that document from the database and then run the program. 2 out of 3 times, it still finds the document even though I've already told mongo to drop it via the mongo shell. Is there
[16:40:53] <draggie> a delay on mongo to drop collections? I am not replicating or sharding at all.
[16:43:08] <bcpk> is there a "this" equivalent in update operators?
[16:43:27] <cheeser> why would there need to be?
[16:43:29] <bcpk> i would like to use $rename to do something like { this : this.toLowerCase() }
[16:43:38] <cheeser> you can't do that, no.
[16:43:45] <bcpk> how should I proceed?
[16:44:08] <bcpk> I just want to lowercase all keys in my objects
[16:44:15] <cheeser> write js in the shell or a small app in ${language} to iterate and update your docs
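The kind of shell loop cheeser means, as a rough sketch (the collection name "things" is made up): there is no update operator that can derive a new key name from the old one, so each document is rewritten client-side.

```javascript
db.things.find().forEach(function (doc) {
    var lowered = {};
    for (var key in doc) {
        lowered[key.toLowerCase()] = doc[key];     // _id passes through unchanged
    }
    db.things.update({ _id: doc._id }, lowered);   // whole-document replacement
});
```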
[17:27:54] <ekristen> kali, Joeskyyy — syncing the new member is going to take a while and I have 3 to add :/
[17:29:56] <unholycrab> is there any reason i shouldn't have more than 3 config servers?
[17:30:55] <unholycrab> like, running a config server on every persist instance?
[17:31:54] <Joeskyyy> You can only have 1 or 3 (1 for dev, 3 for production)
[17:40:36] <starfly> unholycrab: even if you wanted more than 3, presumably 3 for prod was chosen as the sweet spot between the need to maintain multiple copies of critical sharding maps and the overhead of what's used to keep them in sync, two phase commit
[18:02:47] <royiv> Hi all; we're trying to configure MMS, but MMS and/or the MMS agent keeps overriding the hostname I give it with the hostname of the machine.
[18:03:29] <royiv> The monitoring agent appears to have been re-written in Go; the old one made this mistake, but you could just remove the broken hosts and re-add them.
[18:03:56] <royiv> However, with this agent, every time I add a host, it removes it and re-adds `hostname`:27017.
[18:04:20] <royiv> Is there any way to get around this? (Should I downgrade to the older Python agent?)
[18:28:17] <unholycrab> nice, starfly
[18:28:29] <unholycrab> thanks
[18:28:58] <unholycrab> starfly: is this 3 config servers per shard? or 3 config servers accross all shards
[18:29:06] <unholycrab> ie: per cluster
[18:36:21] <starfly> unholycrab: 3 config servers total per production sharded cluster
[18:37:09] <unholycrab> thanks starfly
[18:37:15] <starfly> sure, np
[19:03:05] <bobbytek1> Hey everyone
[19:03:18] <bobbytek1> Any idea why mongod wouldn't be using all available memory?
[19:03:39] <bobbytek1> I have a 128G RAM node and it is only taking 27%
[19:03:48] <bobbytek1> Nothing else is running on this box
[19:04:05] <unholycrab> starfly: what kind of downtime should i expect when enabling sharding at the database level?
[19:04:24] <unholycrab> given a single shard, and no sharded collections/shard keys yet
[19:04:27] <unholycrab> should it be instantaneous?
[19:04:44] <bobbytek1> Here are some stats on the box: http://pastie.org/private/owlx4jhil5hsqsv9u7d0ww
[19:08:26] <rickibalboa> bobbytek1, how big is your working set? It probably doesn't need to
[19:09:10] <kali> first time somebody complains about mongodb not using enough RAM :)
[19:09:19] <bobbytek1> :)
[19:09:21] <Zelest> Haha
[19:09:23] <bobbytek1> One sec
[19:10:51] <bobbytek1> Output of db.stats()?
[19:11:03] <bobbytek1> http://pastie.org/private/glpsizcor0r9mlpg0zb2wq
[19:12:20] <kali> storageSize: ~ 12GB
[19:13:05] <bobbytek1> Does that include all indexes?
[19:13:08] <kali> yeah
[19:13:13] <bobbytek1> I thought it is also bigger in memory
[19:13:15] <kali> and padding and all
[19:13:38] <bobbytek1> hmm
[19:13:47] <Guest84073> In an aggregation pipeline, how do I go about returning the value of a document that has the minimum date?
[19:13:53] <bobbytek1> So I guess I'm looking good there then
[19:13:58] <kali> yep
[19:14:04] <bobbytek1> Trying to increase the read performance
[19:14:11] <bobbytek1> I would have thought it would be quicker
[19:14:27] <bobbytek1> kali: Thanks for taking a look :)
[19:14:36] <kali> bobbytek1: not by throwing more memory at it
[19:14:49] <bobbytek1> Assuming I have the correct indexes setup, how can I optimize cursor reads?
[19:15:16] <Wil> For example, I find all documents that match some value. Then I want to return a specific field in that document where the doucment has a value that is some minimum. Does that make sense?
[19:17:11] <starfly> unholycrab: there's setup time for config servers and mongos routers, but as far as prod outage to enable sharding, I believe it's minimal, but haven't taken a large unsharded MongoDB database and done that. The heavy work occurs in the background when sharding large collections, but that (of course) could impact production performance, depending upon how well or over-provisioned your production environment is.
[19:17:46] <Wil> Get doucments that match X, and in the document which has the minimum value of ABC, return the value Y.
[19:19:41] <Wil> Let's say I have a document: { name: smith, age: 3 }... {name: bob, age: 6}... I want to select the minimum document by age (3), and return the name (smith)
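One way to express Wil's query in the aggregation pipeline: filter, sort ascending on the field, then take the first value in a $group.

```javascript
db.people.aggregate([
    { $match: { /* whatever selects the candidate documents */ } },
    { $sort: { age: 1 } },
    { $group: { _id: null, name: { $first: "$name" }, age: { $first: "$age" } } }
]);
// -> { "_id": null, "name": "smith", "age": 3 }
```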
[19:19:54] <bobbytek1> I have a basic query that searches for all documents with a field that contains a certain value. I then stream through the cursor client side using the Java driver. How can I make this performant as possible for a single node setup?
[19:22:33] <unholycrab> awesome, starfly
[19:23:03] <Wil> Ah hrm
[19:23:38] <unholycrab> im guessing the most disruptive phases of converting to a sharded cluster are creating the indexes, and the initial balancing out of the documents across new shards
[19:29:09] <starfly> unholycrab: new index generation can of course be a big hit whether sharding or not, so yes, additional indexing needed for formerly unindexed shard keys will be a load. Not sure disruptive is the right term, but guess if your collection is large enough, that could apply. You can do things like pre-split documents across new shards to minimize automatic chunk movement by the balancer later, etc. to minimize transition to a fully-balanced shard
[19:29:10] <starfly> configuration.
[19:30:36] <unholycrab> starfly: awesome. pre-split chunking may be something to look at
[19:31:17] <starfly> unholycrab: yes, and meant to say, pre-split empty collections.
[19:31:50] <unholycrab> yeah i just read that. empty collections doesn't help me much
[19:33:17] <starfly> unholycrab: depends upon your use, splitting chunks is expensive, so pre-splitting can help
[19:34:00] <unholycrab> hmm i have an idea for my use case
[19:35:30] <starfly> unholycrab: good deal and good luck!
[19:46:42] <motaka2> Is mongodb good for large applications
[19:46:44] <motaka2> ?
[19:47:07] <unholycrab> motaka2: yes
[19:47:20] <motaka2> unholycrab: is it relational ?
[19:47:44] <rickibalboa> motaka2, no
[19:47:45] <unholycrab> its NoSQL
[19:48:21] <motaka2> so if I delete a parent row I have to delete the child node from a third party ?
[19:49:09] <unholycrab> there are no rows because there are no tables. mongodb stores collections of documents
[19:52:06] <txt231> In general I wonder if you guys keep the default port for your application server or change it to something else.
[19:54:24] <unholycrab> starfly: how do i specify where the split happens when enabling sharding over a key
[19:54:46] <unholycrab> say the key is a timestamp. can i pick a value where the split would happen?
[19:55:37] <starfly> unholycrab: check this out: http://docs.mongodb.org/manual/tutorial/split-chunks-in-sharded-cluster/
[19:56:02] <unholycrab> so, i can't do what im describing with a non-empty collection?
[20:00:42] <ranman> unholycrab: you can split already sharded collections just make sure you keep the chunks roughly uniform
[20:00:50] <ranman> (in size)
[20:00:56] <starfly> unholycrab: this is probably the best reference: http://docs.mongodb.org/manual/tutorial/create-chunks-in-sharded-cluster/
[20:01:03] <unholycrab> hmm maybe im oversimplifying how this works
[20:01:34] <unholycrab> i want to shard over a timestamp key... leave old documents on the existing shard, and persist new documents to a new shard
[20:02:21] <unholycrab> it would not be an even split between the two shards
[20:02:36] <starfly> unholycrab: how big are the collection(s) you want to shard? If they aren't excessively large, you might be better off spending time consider the best shard key and let automatic sharding do the work
[20:03:14] <unholycrab> 1TB
[20:03:19] <starfly> unholycrab: only two shards? Lot of work without a lot of potential benefits...
[20:03:35] <ranman> unholycrab: is it more of an archive vs active situation?
[20:03:37] <unholycrab> perhaps starfly
[20:03:46] <unholycrab> ranman: yes. the "head shard" will get the most queries
[20:04:02] <ranman> tag aware sharding is probably more what you're looking for: http://docs.mongodb.org/manual/core/tag-aware-sharding/
[20:04:48] <unholycrab> oh this is exactly it, ranman
[20:04:49] <unholycrab> sweet
[20:05:04] <ranman> sweet indeed, GL, come back if you need help :)
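A hedged sketch of the tag-aware setup being described (shard names, namespace and timestamp field are assumptions): tag the shards, then pin key ranges to the tags so old documents stay on the existing shard and new ones land on the new one.

```javascript
sh.addShardTag("shard0000", "archive");   // existing shard keeps the older data
sh.addShardTag("shard0001", "recent");    // new shard receives the newer data

// Documents with ts before the cutoff stay on "archive"; everything after goes to "recent".
sh.addTagRange("mydb.events", { ts: MinKey }, { ts: ISODate("2014-03-01T00:00:00Z") }, "archive");
sh.addTagRange("mydb.events", { ts: ISODate("2014-03-01T00:00:00Z") }, { ts: MaxKey }, "recent");
```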
[20:10:08] <starfly> unholycrab: another consideration, sharding is primarily a way of scaling writes, if your load is mostly reads, you might be better off scaling those with multiple secondaries in a replica set
[20:17:55] <unholycrab> can i modify tag ranges as easily as i can add them?
[20:19:21] <unholycrab> if i make a change to the tag ranges, is the shards rebalance according to the new assignments
[20:19:31] <unholycrab> derp; would the shards rebalance
[20:22:17] <ekristen> so um yeah this doesn’t look good — http://pastebin.com/Uug8QapP
[20:53:42] <unholycrab> "the general rule is: minimum of N servers for N shards" is this true? if i get to the point where i have 4 shards, should i convert each shard to a 5 member replica set?
[21:32:27] <joannac> ekristen: don't worry about it. it's an internal message just saying the internal queue has gotten large
[21:32:38] <joannac> (which is fine since you're index building)