[05:35:48] <xshk> Has anybody used Puppet to provision MongoDB? I am trying to deploy a standard mongo cluster of 1 primary, 3 secondaries and 1 arbiter, but I'm having problems even getting the primary working
[05:36:57] <joannac> have you initiated the replica set?
[05:40:54] <xshk> I thought the puppetlabs mongodb module would do that
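The puppetlabs mongodb module can manage replica sets, but if initiation isn't happening, the set can be initiated by hand from the would-be primary. A minimal pymongo sketch, with placeholder hostnames and set name matching the 1-primary, 3-secondary, 1-arbiter topology xshk describes:

```python
from pymongo import MongoClient

# Connect directly to the not-yet-initiated member (placeholder host).
client = MongoClient("mongo1.example.com", 27017)

config = {
    "_id": "rs0",  # must match the replSet name in mongod.conf
    "members": [
        {"_id": 0, "host": "mongo1.example.com:27017"},
        {"_id": 1, "host": "mongo2.example.com:27017"},
        {"_id": 2, "host": "mongo3.example.com:27017"},
        {"_id": 3, "host": "mongo4.example.com:27017"},
        {"_id": 4, "host": "mongo5.example.com:27017", "arbiterOnly": True},
    ],
}
client.admin.command("replSetInitiate", config)
```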
[06:12:11] <KekSi> 'morning -- where does the $external db reside? along with the admin db on the config servers?
[06:13:42] <KekSi> it shows up in sh.status() but not in show dbs
[06:14:56] <KekSi> but it says partitioned: false and primary: rs0 (the first replica set), so I'm sort of confused how one would be able to do certificate login on the other replica sets
[06:48:39] <KekSi> I'm using it and it works fine, but I'm still curious why it's not either partitioned or located on the config servers
[06:49:45] <KekSi> my cluster isn't in production yet, and I wanted to know whether I'll have to add users on all shards manually for it to work properly
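For reference: $external is a virtual authentication database, which is why it never appears in show dbs; the user documents themselves are stored in admin.system.users, which in a sharded cluster lives on the config servers. So an x.509 user created once through mongos is visible cluster-wide. A hedged pymongo sketch, with a placeholder certificate subject:

```python
from pymongo import MongoClient

# Connect through mongos (placeholder host).
client = MongoClient("mongos.example.com", 27017)

# Create one x.509 user; the name is the client certificate's subject.
client["$external"].command(
    "createUser",
    "CN=client1,OU=ops,O=ExampleCorp",  # hypothetical subject
    roles=[{"role": "readWrite", "db": "test"}],
)
```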
[09:02:51] <krion> I'm trying to understand the impact of the config server setting on mongos
[09:03:23] <krion> looks like I had a preproduction config server configured on an arbiter for quite a long time
[09:03:41] <krion> the thing is the arbiter is a production server :D
[09:04:06] <krion> since it's an arbiter, I'm not sure, but I guess the impact is null if I fix this
[09:22:17] <krion> I don't get how I was able to add an arbiter with a configsvr different from the rest of the replica set...
[11:24:35] <JoshK20> Hey. I have a document that looks like this: http://hastebin.com/agojuwigap.coffee. I want to insert into and remove from "package". How would I do this using the PHP driver?
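Array fields are modified with the $push and $pull update operators, and the PHP driver takes the same operator documents. A sketch in Python for brevity, with a hypothetical collection, filter, and element shape:

```python
from pymongo import MongoClient

coll = MongoClient().test.mycollection  # hypothetical db/collection names

# Append an element to the "package" array.
coll.update_one({"_id": 1234}, {"$push": {"package": {"name": "foo"}}})

# Remove all matching elements from the "package" array.
coll.update_one({"_id": 1234}, {"$pull": {"package": {"name": "foo"}}})
```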
[11:31:34] <pamp> anyone here with java driver experience?
[11:32:13] <pamp> I want to make an unordered Bulk operation
[11:32:29] <pamp> I get the collection like this: DBCollection coll = db.getCollection("testCollection");
[11:33:04] <pamp> then I create a bulk like this: BulkWriteOperation builder = coll.initializeOrderedBulkOperation();
[11:33:28] <pamp> But I get this error : The method initializeUnorderedBulkOperation() is undefined for the type DBCollection
[11:33:46] <pamp> this is the way I see it done in the docs
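Two things stand out in that paste: the snippet calls the ordered variant (initializeOrderedBulkOperation) even though an unordered bulk was wanted, and the compile error about initializeUnorderedBulkOperation being undefined usually means the driver jar predates version 2.12, where DBCollection's bulk API was introduced. Upgrading the Java driver and calling coll.initializeUnorderedBulkOperation() should resolve both.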
[11:37:34] <fontanon> Hi everybody, I've set up a tag-aware sharded cluster. After changing the split points for the shard keys, data started to flow from shard1 to shard2 and shard3, but a lot of space is being used during the process.
[11:38:18] <fontanon> I see a directory called moveChunk in my /data using a lot of space. Why?
[12:04:01] <fontanon> Hi! How can I reclaim the proper free space after a chunk migration ? My primary node is still consuming the space for the migrated chunks
[12:08:46] <fontanon> Is it ok to remove the /data/moveChunk dir during a chunk migration ?
[12:10:43] <deathanchor> fontanon: read about http://docs.mongodb.org/manual/reference/command/repairDatabase/
[12:12:00] <deathanchor> fontanon: you might be better off stepping down and resyncing from another member, but that requires that your secondary takes on operations while your primary resyncs from it.
[12:12:58] <fontanon> deathanchor, the point is I'm not running replicas.
[12:15:24] <deathanchor> then repair is your only choice, I hope it's not a production db.
[12:16:09] <fontanon> deathanchor, it says "repairDatabase requires free disk space equal to the size of your current data set plus 2 gigabytes."
[12:16:34] <fontanon> deathanchor, I don't have that much free space
[12:19:24] <fontanon> deathanchor, I'm just hoping to have enough space on the primary shard to complete the chunk migration. I expect that once the migration is complete, MongoDB will free the corresponding space on the primary shard, won't it?
[12:20:10] <deathanchor> not really, it's all about the padding and the data in files
[12:20:33] <deathanchor> this gets into the gritty details which I
[12:20:59] <fontanon> deathanchor, but is it a matter of having enough space and waiting enough time for having the space free in the primary shard ?
[12:21:00] <deathanchor> I'm not sure of, but I believe it only frees up space if all the data is gone from the file.
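For the record, on the MMAPv1 storage engine of that era, repairDatabase is the main way to shrink the data files of a standalone after chunks migrate away; compact only defragments records within the existing files and does not return space to the OS. A minimal pymongo sketch, assuming hypothetical database and collection names mydb and mycoll:

```python
from pymongo import MongoClient

db = MongoClient().mydb  # hypothetical database name

# Rebuilds the data files; needs free space ~= data set size + 2 GB.
db.command("repairDatabase")

# Alternative: defragments one collection in place (MMAPv1),
# but does NOT shrink the files on disk.
db.command("compact", "mycoll")
```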
[12:22:46] <kba> Hi, I've never really used any NoSQL databases, and I'm just trying out MongoDB now. Coming from a world of RDBMS, I can't figure out how to structure my data in Mongo.
[12:23:02] <kba> Say I have something like a blog with multiple authors. How would it make sense to structure that?
[12:23:45] <kba> In an RDBMS, I'd create a users table and an articles table, and have an author column in the articles table point to a user in the users table.
[12:23:54] <kba> I understand I can't really do this in MongoDB.
[12:25:24] <kba> Would I just create a collection of articles with user IDs, and a users collection?
[12:25:51] <deathanchor> kba: that's a really long conversation. There are various mongo tutorials which can help you wrap your head around mongo.
[12:26:12] <kba> I'd be happy if somebody could link me a good article, too
[12:26:35] <kba> something they've read that's actually good; I've found a ton of bad articles myself
[12:28:29] <kba> the articles, for instance, suggest I have an author field with the full user embedded, but that would cause a lot of redundant data
[12:29:10] <kba> and if the author were suddenly to change his email, it would have to be changed both in the users collection and in every single document in the articles collection written by that user
[12:39:43] <kba> I think I found what I'm looking for.
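What kba is circling around is the standard manual-referencing pattern: store the author's _id in each article and look the author up when needed, so an email change touches exactly one document. A sketch with hypothetical field and database names:

```python
from pymongo import MongoClient

db = MongoClient().blog  # hypothetical database name

# One user document; _id is generated by MongoDB.
author_id = db.users.insert_one(
    {"name": "Ada", "email": "ada@example.com"}
).inserted_id

# Articles reference the author by _id instead of embedding the user.
db.articles.insert_one({"title": "Hello", "author_id": author_id})

# "Join" by hand: fetch the article, then its author.
article = db.articles.find_one({"title": "Hello"})
author = db.users.find_one({"_id": article["author_id"]})
```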
[12:50:07] <heuri> Does anybody know if it's possible to use a replica set besides a master-slave replication? I'd need one master-slave replication to avoid a bilateral connection between the servers. Thanks!
[12:51:23] <deathanchor> heuri: I'm confused by your second sentence.
[12:52:57] <heuri> As far as I know, each member of a replica set needs to be able to communicate with every other member (e.g. for elections and heartbeats)
[12:53:28] <heuri> But I want one member to have a one-way connection (backup replication) only.
[12:54:10] <heuri> So that this one member only pulls the oplog.
[12:54:32] <deathanchor> do you require that it cannot communicate with the primary? if not, then just set it as hidden. if yes, then I have no idea.
[12:57:20] <heuri> So if I set it to hidden, it doesn't need to be reachable by the primary or the other secondaries?
[12:58:32] <deathanchor> it does require communications both ways. I don't know how to avoid that. There might be a script which can read oplogs.
[12:59:41] <heuri> Alright, I see thank you very much! :)
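For anyone reading later: a hidden member (which must also have priority 0) is invisible to clients but still replicates and, as deathanchor says, still needs two-way communication with the set. A reconfig sketch in pymongo, assuming MongoDB 3.0+ (replSetGetConfig) and that the member to hide is at index 2:

```python
from pymongo import MongoClient

client = MongoClient("primary-host", 27017)  # placeholder host

# Fetch the current replica set config and mark one member hidden.
cfg = client.admin.command("replSetGetConfig")["config"]
cfg["members"][2]["hidden"] = True
cfg["members"][2]["priority"] = 0  # hidden members must have priority 0
cfg["version"] += 1
client.admin.command("replSetReconfig", cfg)
```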
[13:38:42] <lino_76> Hello, not sure if this is the correct channel, but I'll ask away... I registered for the MongoDB for Developers course, but cannot seem to find where any of the learning material is stored. Has anyone taken this course?
[13:39:12] <StephenLynx> I think joannac works for 10gen, she might know better, lino
[13:39:51] <StephenLynx> lucsx that sounds weird. But I don't think a failure with mongo could cause such a precise data wipe.
[13:40:08] <StephenLynx> where and how are you hosting your db?
[13:40:52] <lucsx> I checked logins, no one except one, and the mongod server is behind iptables
[13:41:07] <lucsx> everything is all right except mongodb
[13:41:15] <seion> How do I find a document by its BillingStreet when it's structured like this? { _id: 1234, addresses: { billing: { BillingStreet: "bla bla bla" } } }
[13:51:32] <seion> StephenLynx: what if I had a billing and a shipping address and I wanted to search on both with a street, like /somestreet here/, against addresses: { billing: { street: 'somestreet here' }, shipping: { street: 'somestreet here' } }
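Dot notation reaches into embedded documents, and $or covers searching both addresses at once. A sketch using the field names from seion's documents, with a hypothetical collection name:

```python
import re
from pymongo import MongoClient

coll = MongoClient().test.accounts  # hypothetical collection name

# Exact match on an embedded field via dot notation.
coll.find_one({"addresses.billing.BillingStreet": "bla bla bla"})

# Case-insensitive regex across both billing and shipping streets.
rx = re.compile("somestreet here", re.IGNORECASE)
coll.find({"$or": [
    {"addresses.billing.street": rx},
    {"addresses.shipping.street": rx},
]})
```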
[13:54:03] <bsdhat> I'm looking for assistance compiling a cpp client from the tutorial. Can anyone assist me?
[13:55:43] <bsdhat> I'm running the command: g++ -I /usr/include mongo_cpp_driver_test.cpp, and getting the error: mongo_cpp_driver_test.cpp: In function ‘int main()’:
[13:55:44] <bsdhat> mongo_cpp_driver_test.cpp:11: error: ‘mongo::client’ has not been declared
[13:56:03] <lucsx> it went smoothly, like a simple upgrade, StephenLynx
[13:56:26] <StephenLynx> yeah, if you don't have data to maintain, it's just installing the new version.
[13:59:57] <pamp> what's the difference between BulkWrite and InsertMany?
[14:00:20] <pamp> which is faster to insert millions of records?
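Roughly: InsertMany is a convenience for inserting a batch of documents, while a bulk write can mix operation types (inserts, updates, deletes) in one round trip; for pure inserts of millions of records the two perform about the same, and unordered execution is usually the bigger win, since the server can continue past individual failures. The pymongo equivalents, with a hypothetical collection:

```python
from pymongo import InsertOne, MongoClient

coll = MongoClient().test.mycoll  # hypothetical collection name

# insert_many: insert-only convenience API.
coll.insert_many([{"n": i} for i in range(100000)], ordered=False)

# bulk_write: mixed operations; insert-only here for comparison.
coll.bulk_write(
    [InsertOne({"n": i}) for i in range(100000)],
    ordered=False,
)
```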
[14:39:44] <jasondockers> Is there still a guide on setting up a test sharded cluster on four servers?
[14:41:06] <cheeser> still? the old url doesn't work?
[14:43:10] <reactive_> hi guys. I'm trying to repair my mongodb, which is 503gb, and I understand that it takes an additional 503gb + 2gb extra. The repair has been running for over 6 hours, and is now about 150% of the size and counting. Even the numbered files in the tmp repair folder are almost double the amount. What's happening?
[14:44:52] <reactive_> it does, but i'm running out of space, and I'm not sure how much more I need to allocate
[14:45:26] <reactive_> is there a reason why it's going over so much?
[15:22:57] <jasondockers> cheeser: if you're responding to me: no
[15:23:12] <cheeser> jasondockers: what was the url?
[15:48:41] <fontanon> Hi people, I expected the fileSize of my primary shard to shrink after moving chunks to the other shards, but it didn't happen. Why?
[15:48:42] <fontanon> I can't understand the relation between moving chunks and fileSize ...
[15:50:19] <cheeser> disk space is not necessarily released back to the OS
[16:07:27] <fontanon> cheeser, but I believe that, given the chunks that were moved, there should be a lot of freed space in the primary shard (the chunks were moved from the primary shard to the others)
[16:07:56] <StephenLynx> perhaps mongo will reuse this space instead of pre-allocating more.
[16:07:57] <fontanon> cheeser, I expected new write operations would re-use that reserved space
[16:08:41] <fontanon> StephenLynx, the point is my primary shard is still using additional space
[16:12:19] <fontanon> StephenLynx, half of all the chunks were moved from the primary shard to the others, and the fileSize is still growing. The chunk migration has finished, as the balancer is no longer running.
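What cheeser and StephenLynx are describing shows up in dbStats: after chunks move away, dataSize drops, but fileSize (disk actually allocated under MMAPv1) stays put, and the freed extents are reused for new writes before new files are allocated. A quick way to watch the gap, with placeholder names:

```python
from pymongo import MongoClient

db = MongoClient("shard-primary-host", 27017).mydb  # placeholder names

stats = db.command("dbStats")
# dataSize: live data; storageSize/fileSize: space held on disk (MMAPv1).
print(stats["dataSize"], stats["storageSize"], stats["fileSize"])
```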
[16:56:46] <deathanchor> is there pymongo function for figuring out the primary of a set?
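There is: in pymongo 3, MongoClient exposes the topology directly. For example:

```python
from pymongo import MongoClient

# Placeholder hosts and set name.
client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")

print(client.primary)      # (host, port) of the primary, or None if unknown
print(client.secondaries)  # set of (host, port) pairs for the secondaries
```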
[18:18:14] <MacWinne_> I have a php app that I'm migrating to nodejs.. currently my php app queries mongo, buffers the entire response, and then sends the entire response.. however, I see with nodejs there is an option to stream the data.. is this how mongo works internally? i.e., will it stream large result sets to a client if the result set does not fit into memory?
[18:18:23] <MacWinne_> but the indexes do fit in memory
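Yes: the server never hands the client the whole result set at once; results come back through a cursor in batches (via getMore), and the Node driver's stream is a wrapper over that same cursor. The same behavior in Python, with a hypothetical collection:

```python
from pymongo import MongoClient

coll = MongoClient().test.bigcoll  # hypothetical collection name

# Iterating a cursor fetches documents in server-side batches,
# so only one batch is held in client memory at a time.
for doc in coll.find({}).batch_size(1000):
    pass  # process each document as it streams in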
[18:59:33] <Pranay> {<an array>: {$regex: <Some variable containing a string>, $options: "i"} } does not work. Any idea why?
[19:09:52] <Pranay> This find query {<an array>: {$regex: <Some variable containing a string>, $options: "i"} } does not work. I want to be able to search docs, containing a tags field which is an array of strings. And the string to look for is in a variable. Any help?
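A regex match against an array field does work (the query matches if any element matches), so the usual culprits are building the regex from the variable incorrectly or unescaped special characters in the string. A sketch with a hypothetical tags field:

```python
import re
from pymongo import MongoClient

coll = MongoClient().test.docs  # hypothetical collection name
needle = "some tag"             # the variable containing the string

# Matches documents where ANY element of the tags array matches.
cursor = coll.find({"tags": {"$regex": re.escape(needle), "$options": "i"}})
# Equivalent: coll.find({"tags": re.compile(re.escape(needle), re.I)})
```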
[22:32:21] <buzzalderaan> hi all, I'm hoping I can get some help designing a query (or updating the data model to better support such a query)
[22:36:57] <buzzalderaan> I'm given an array of transactions and I want to check if they exist in the collection based on two fields, a transaction id and a date. Is there a way to get all the results at once, or would I have to query per item in the array?
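One round trip is possible with $or, one branch per (transaction id, date) pair; a compound index on the two fields keeps it fast. A sketch with hypothetical field and collection names:

```python
from pymongo import MongoClient

coll = MongoClient().test.transactions  # hypothetical collection name

incoming = [
    {"txn_id": "a1", "date": "2015-06-01"},
    {"txn_id": "b2", "date": "2015-06-02"},
]

# One query: matches any stored transaction with the same id AND date.
found = coll.find({"$or": [
    {"txn_id": t["txn_id"], "date": t["date"]} for t in incoming
]})
existing = {(d["txn_id"], d["date"]) for d in found}
missing = [t for t in incoming if (t["txn_id"], t["date"]) not in existing]
```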
[23:12:10] <f31n> hi, I'm totally new to Mongo / NoSQL databases, and I just got a project where I should create a MongoDB database. In my SQL world it would look like this: http://pastebin.com/kJqSB0TB. What would you advise me not to do, or to do totally differently?
[23:13:25] <StephenLynx> if you are going to use a non-named unique ID, you might as well not define it; just roll with mongo's _id
[23:13:37] <StephenLynx> I prefer to ignore it and just create my unique indexes.
[23:13:48] <StephenLynx> with proper names, such as "login" or "email".
[23:14:40] <StephenLynx> and it depends on how you are going to query.
[23:15:00] <StephenLynx> if you are going to use fake references, keep in mind you will not be able to join to gather data.
[23:15:28] <StephenLynx> in that case you might want to duplicate this information.
[23:16:00] <StephenLynx> which may cause issues if you wish to change the related entity, like the type name.
[23:16:25] <StephenLynx> you would have to update all questions that are of this type.
[23:19:48] <f31n> okay, but I could make one select after the other, like: select results where _id_users == xy, and then from that array select questions from questions where _id_questions is in the array
[23:32:02] <f31n> StephenLynx: so if I got that right, it would look better like this: http://pastebin.com/wrh5axJw ?
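The two-step lookup f31n describes is the standard pattern for manual references; with $in, the second step collapses into a single query. A sketch with hypothetical collection and field names:

```python
from pymongo import MongoClient

db = MongoClient().quiz  # hypothetical database name

user = db.users.find_one({"login": "xy"})

# results documents reference both the user and a question.
result_docs = db.results.find({"user_id": user["_id"]})
question_ids = [r["question_id"] for r in result_docs]

# One query fetches all referenced questions at once.
questions = list(db.questions.find({"_id": {"$in": question_ids}}))
```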
[23:52:03] <svm_invictvs> So, is there a way to get a seekable stream from the GridFS library (in Java)?