#mongodb logs for Wednesday the 14th of January, 2015

[00:03:59] <harttho> Kind of a silly question, but what's the standard way to name Databases/collections
[00:04:06] <harttho> databaseName vs. database_name
[00:04:10] <harttho> collectionName vs collection_name
[00:06:12] <lexi2> wanted to know with tag aware sharding: if i update a doc that's using tag aware sharding and the update causes it to move to another tag or range, what's the process that takes place?
[00:06:18] <Boomtime> harttho: that is entirely subjective
[00:06:24] <lexi2> harttho: it's totally up to you
[00:06:41] <lexi2> harttho: name them based on your programming language's standard is what i recommend
[00:07:34] <Boomtime> lexi2: tag aware is a distribution of the shard key, and the shard key for a document cannot change
[00:08:03] <lexi2> okay cool thats fine i will not try to get creative or fancy
[00:08:10] <lexi2> Boomtime: ^
[00:08:36] <Boomtime> you can try, but any update that attempts to change the shard key of a document will be rejected
[00:09:27] <Boomtime> if you change the tag ranges themselves though i believe that it just triggers regular migrations.. but i'm not certain actually
[05:13:35] <speaker1234> any suggestions on how to detect duplicate records? The only thing I can think of is to add a sha1 of all the variable fields and search on that.
[05:41:45] <Boomtime> speaker1234: can you define what you mean by duplicate records? _id must be unique across all documents, so it is theoretically impossible to have precisely duplicate records
[05:41:58] <Boomtime> (in the same collection)
[05:42:55] <speaker1234> The situation is that a customer of mine is sending out batches of records and every so often they send out a duplicate batch because their side burped
[05:43:16] <speaker1234> I have to do data normalization on what they send me so I get to handle all the fun filtering. :-)
[05:45:51] <Boomtime> ok, your problem is with the data you are receiving from some third-party?
[05:46:13] <Boomtime> is there some sort of key in these records? how do you know they are repeats?
[06:23:56] <speaker1234> Boomtime, sorry missed your response.
[06:24:45] <speaker1234> Boomtime, the only way I can tell if a record is a dupe is if certain fields are the same.
[06:25:09] <speaker1234> anyway, I'm off to bed
[06:25:12] <speaker1234> nite
[06:27:37] <Boomtime> speaker1234: if certain fields being the same indicate a duplicate, then make a unique index on the combination of those fields
[06:28:05] <Boomtime> mongodb would not allow the duplicate to enter the database, specifically giving you the error "duplicate key"
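A minimal sketch of that suggestion; the collection and field names here are hypothetical stand-ins for whatever combination marks a record as a duplicate:

    // Unique compound index: a second insert with the same combination of values
    // fails with a "duplicate key" (E11000) error instead of entering the collection.
    db.records.ensureIndex(
        { customerId: 1, batchId: 1, sentAt: 1 },
        { unique: true }
    )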
[06:28:24] <sijojose> Hi, I've a collection called trainings; inside that there is a field courses: [{id:1,name:'abc'},{id:2,name:'def'},{id:3,name:'fgh'}]. How can I query for a specific course by using its id?
[06:33:03] <Boomtime> sijojose: db.trainings.find({"courses.id": <id>}, {"courses.$": 1}) or something like that
[06:35:49] <sijojose> Boomtime: let me see.. thanks
[06:37:26] <sijojose> Boomtime: one more thing, for inserting values is this approach courses:[{id:1,name:'abc'},{id:2,name:'def'},{id:3,name:'fgh'}] correct, right?
[06:38:09] <Boomtime> that will work
[06:38:26] <Boomtime> you are updating the whole document with that approach
[06:38:34] <Boomtime> i have to go, sorry
[06:39:06] <sijojose> Boomtime, not of course ...
[06:41:11] <sijojose> Boomtime, in training collection courses is a field.. in courses field I'm inserting values like that... in sql courses will be another table with foreign key reference to trainings table
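On the insert side, pushing a single element avoids rewriting the whole document, which is what Boomtime was cautioning about. A sketch with placeholder values:

    // Append one course to the embedded array of an existing training document:
    db.trainings.update(
        { _id: someTrainingId },                       // placeholder for the target document's _id
        { $push: { courses: { id: 4, name: 'ijk' } } }
    )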
[09:00:18] <optiz0r> Morning all, I've been banging my head against an authentication issue for a few hours. Could anyone help me with https://gist.github.com/optiz0r/f8ba0b8d382ab0884191 I can login via the shell with the user's credentials and run show collections but attempting to do the same via mongoengine gives an authorisation failure
[09:07:45] <optiz0r> looks like there was quite a lot of change relating to authentication between 2.4 and 2.6 so I should mention this is using 2.4.6
[10:54:02] <joannac> optiz0r: where's the authentication database in your test?
[10:54:54] <optiz0r> joannac: as in which database was the user created in? the pulp database
[10:56:58] <optiz0r> I also tried creating the user in the admin database, then adding it with userSource:"admin" in the pulp database; and adding it in the admin database with otherRoles:{"pulp":["readWrite","dbAdmin"]}. Both resulted in the same error message as when the user was defined in the pulp database directly
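For reference, the 2.4-era shell call for the delegated-credentials variant looks roughly like this (a sketch; note the 2.4 privilege-document field for cross-database roles is otherDBRoles, and db.addUser was replaced by db.createUser in 2.6):

    // User defined in the pulp database whose credentials are looked up in admin:
    use pulp
    db.addUser({ user: "pulp", userSource: "admin", roles: ["readWrite", "dbAdmin"] })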
[11:00:51] <joannac> can you see the connection from your test program authing successfully in the mongod log?
[11:02:41] <Cygnos> juwimm
[11:03:43] <optiz0r> joannac: I see the connection but not an auth attempt logged (successful or otherwise) https://gist.github.com/optiz0r/f8ba0b8d382ab0884191#file-mongodb-log Do I need to bump up the logging verbosity to see that?
[11:06:32] <joannac> optiz0r: I thought it would show up at default log level
[11:08:39] <joannac> yup just confirmed, should show up at default
[11:09:21] <joannac> where's the actual start of the connection?
[11:09:38] <joannac> is that all of the log entries for that connection?
[11:26:59] <optiz0r> joannac: sorry, had to pop away from the desk for a few moments. you're right, there was an extra line I omitted (there are replication heartbeats every few minutes so the log is somewhat noisy). I'm slightly confused that it attempted authentication as __system with a key. I do have a keyfile for replication, but I'm explicitly attempting user/password authentication from the code
[11:27:13] <optiz0r> https://gist.github.com/optiz0r/f8ba0b8d382ab0884191#file-mongodb-log updated with the extra lines
[11:27:47] <optiz0r> and indeed that the authenticate db is local, rather than pulp
[11:36:41] <optiz0r> ok ignore that, I was grepping on the connection id and pulled in a line from another day by mistake. Even bumping up verbosity to 4 I don't see an authenticate line from my test script
[11:37:16] <joannac> right. well if it's not authing, that explains why it's not working
[11:39:51] <joannac> optiz0r: you'll need to get your program to actually auth :)
[11:40:14] <optiz0r> joannac: indeed, perhaps a bug in mongoengine then. thanks for your help :)
[11:41:02] <joannac> np
[11:49:44] <StephenLynx> oh, an update on apt's mongo repo.
[11:54:20] <winem_> StephenLynx, to a version > 2.6?
[11:54:37] <StephenLynx> 2.6.7
[11:54:42] <winem_> great!
[11:55:07] <winem_> and I told my colleague to download the tar.gz from the website because the apt repos only had 2.4.x this morning
[11:55:10] <winem_> thank you :)
[12:22:25] <vinnie_is_in_da_> guys
[12:23:16] <vinnie_is_in_da_> One of my slaves broke a few days ago and I didn't realize it - I switched it back on today
[12:23:57] <vinnie_is_in_da_> I was expecting the slave to either do a full restore or a big partial one
[12:24:27] <vinnie_is_in_da_> but instead I got access to the non-updated db
[12:24:34] <vinnie_is_in_da_> (from a few days ago)
[12:25:00] <vinnie_is_in_da_> any clue ? I am confused
[12:48:04] <StephenLynx> and yum repository also updated
[12:48:08] <StephenLynx> to 2.6.7
[13:41:34] <jmacdonald> As a beginner, it seems there are enough subtle administrative differences between 2.4.9 and 2.6.x that I should definitely go with 2.6 for updated docs. is this assumption correct? i ask cause ubuntu 14.04 has 2.4.9 by default and i have to source up a 3rd party repo to get 2.6
[13:41:53] <StephenLynx> yes, use the 3rd party repo.
[13:42:04] <jmacdonald> also, what is 10gen?
[13:42:11] <cheeser> there should be an official repo for that.
[13:42:14] <StephenLynx> afaik mongo itself owns the repo
[13:42:22] <cheeser> 10gen is the old company name for mongodb
[13:42:46] <jmacdonald> gotcha. okay cool.
[13:43:10] <jmacdonald> I don't actually use mongo, but i'm being asked to admin a dev setup of it, so i'm more wrapping my mind around roles for doing backups and whatnot.
[13:43:40] <jmacdonald> and yes, using official repo.
[14:28:38] <Folkol> Hello. Mongod just filled up my disk and crashed. The data is not important, so I would like to drop the database to free some disk space. But I can not start mongod due to no disk space for the journal... Can I remove the datafiles manually, or will that leave mongo in an inconsistent state?
[14:29:03] <cheeser> well, if you don't care about the data, just blow it away.
[14:29:10] <kali> erase everything under the dbpath directory
[14:29:18] <Folkol> Alright. Thanks.
[14:29:50] <cheeser> UberG0Su: anything to do with Gosu the language?
[14:30:42] <UberG0Su> @cheeser: nope rather with sc:bw ;p
[14:32:11] <cheeser> i don't remember that part...
[14:36:08] <jiffe> is there a way I can start a replica member and have it skip the first operation it will try to sync?
[14:52:07] <dschneider> I tried to update to mongodb 2.6.7 (with yum) but the rpm repository seems to be broken: http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/repodata/primary.xml.gz: Metadata file does not match checksum
[14:52:46] <dschneider> Is this really a problem at mongodb.org or am I doing something wrong?
[14:52:48] <cheeser> dschneider: checking
[14:53:04] <StephenLynx> I just updated a couple of hours ago, probably it's on your side.
[14:53:55] <StephenLynx> http://docs.mongodb.org/manual/tutorial/install-mongodb-on-red-hat-centos-or-fedora-linux/ this is what I use.
[15:01:23] <dschneider> StephenLynx: I have the same mongodb.repo configuration as in the tutorial and it was working before. I also did a "yum clean all" before I tried to update.
[15:35:22] <catphish> i'm planning to store power consumption of a large number of electronic devices (10s of thousands) at 1 minute intervals, i have a couple of basic questions about how to organize the data
[15:36:34] <catphish> firstly, is it common to use separate collections to shard data like mine on a per-device basis, or would you normally throw them all in together into one table as i would have done with a sql database?
[15:37:18] <catphish> and secondly, given that my data set will very quickly exceed available storage, does mongo offer anything to automatically aggregate averages of old data (similar to rrd)
[15:38:45] <StephenLynx> afaik, as long as you don't use $group, a sharded deploy will not have issues with a collection that is distributed among the machines
[16:07:11] <elux> hello.. will 2.8.0 be a drop-in replacement for 2.6.6 ..?
[16:13:08] <Torkable> can anyone comment on experience with the two-phase commit pattern or any alternatives?
[16:13:26] <Torkable> best comparison I could find was
[16:13:27] <Torkable> http://edgystuff.tumblr.com/post/93523827905/how-to-implement-robust-and-scalable-transactions
[16:13:47] <Torkable> job queue seemed like only real alternative
[16:17:09] <cheeser> dschneider: any luck updating?
[17:08:33] <proteneer> my secondaries are stuck in { "$err" : "not master and slaveOk=false", "code" : 13435 }
[17:08:44] <proteneer> ie. it's not synced to the master
[17:08:52] <proteneer> but rs.status() shows that health etc. is fine
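As an aside, that error means the connection is refusing secondary reads rather than indicating a sync problem; in the shell the opt-in is:

    // Enable reads on a secondary for this connection:
    rs.slaveOk()
    db.someCollection.find()   // hypothetical collection; the read is no longer rejected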
[17:10:48] <Chubbs> I am building an API where I need to append a few fields to a mongo result (in Node.JS) before sending it as JSON, however whenever I try to add a key-value pair to my mongo object it does not appear. I can change the value of an existing field without issue, however. Is there some way to make this object modifiable or do I need to clone the result into a new object before I can alter it?
[17:11:17] <Torkable> wut
[17:11:39] <Torkable> you're doing it wrong
[17:11:57] <Torkable> only explanation
[17:13:31] <Torkable> check your logic
[17:13:46] <StephenLynx> Chubbs I never heard about non-mutable javascript objects.
[17:13:49] <Torkable> or pastebin a snippet
[17:14:00] <Chubbs> Ahhh, no I'm an idiot, this is a mongoose issue, not mongo
[17:14:04] <Torkable> StephenLynx, check out the Immutable lib :)
[17:14:05] <StephenLynx> :v
[17:14:24] <StephenLynx> Torkable a library, not a language feature.
[17:14:49] <StephenLynx> Chubbs I really avoid stuff like mongoose
[17:14:56] <Torkable> same
[17:15:09] <StephenLynx> they can cripple your performance because of what they do behind the scenes
[17:15:16] <StephenLynx> and they add no real features.
[17:15:24] <Torkable> I never really saw the point of mongoose
[17:15:33] <StephenLynx> plus it is an additional layer of complexity
[17:15:45] <Torkable> the node mongo lib is easy to use and works good
[17:15:45] <Chubbs> StephenLynx: It's a project I've inherited, I'm not speccing it out
[17:16:02] <Chubbs> But I would agree
[17:16:27] <StephenLynx> Torkable it is another travesty for people who want to shoehorn a design other than the one the creators conceived.
[17:16:39] <StephenLynx> like classes in javascript
[17:16:57] <Torkable> >.<
[17:17:19] <Torkable> classes are dumb
[17:17:38] <StephenLynx> no they are not
[17:17:42] <StephenLynx> they are very useful.
[17:17:52] <StephenLynx> but javascript was designed in a way that classes do not fit it.
[17:18:11] <StephenLynx> you might as well create a new language
[17:18:14] <Torkable> but damn, have you looked at transducers with generators
[17:18:15] <Torkable> cool stuff
[17:18:31] <StephenLynx> generators are another travesty, from the little I read, but for callbacks
[17:18:47] <Torkable> generators are cool
[17:18:47] <StephenLynx> reading about transducers
[17:19:03] <Torkable> you can do some cool stuff with cps, transducers, and generators
[17:19:11] <Torkable> csp
[17:19:28] <StephenLynx> transducers are just a library?
[17:19:41] <StephenLynx> in javascript?
[17:20:02] <Torkable> yea, there are two currently
[17:20:08] <StephenLynx> ...
[17:20:10] <StephenLynx> and they just
[17:20:13] <StephenLynx> transform data?
[17:20:18] <Torkable> a port of clojure's async.core
[17:20:30] <StephenLynx> and they made a whole lib around making something something else?
[17:20:38] <Torkable> they are another level of abstraction on map-reduce idea
[17:20:54] <Torkable> sorta
[17:20:54] <StephenLynx> it sounds like just syntax sugar.
[17:21:02] <StephenLynx> bloat
[17:21:05] <Torkable> you should read about it
[17:21:08] <StephenLynx> I am
[17:21:15] <Torkable> probably learn more than me telling you
[17:21:42] <StephenLynx> sounds really stupid so far.
[17:22:18] <StephenLynx> these transformations are very case-specific
[17:22:26] <StephenLynx> so you must define this logic either way.
[17:22:41] <StephenLynx> you are just adding a layer of complexity and a generic system to your code.
[17:23:17] <StephenLynx> and of course, then there is a sort of dialect "Immutable.Vector.prototype['@@append']"
[17:23:27] <StephenLynx> that is exclusive to the library
[17:23:36] <StephenLynx> yeah, nah. this is bullshit.
[17:25:31] <StephenLynx> if people were to consider library code part of their project, they would stop using this much bloat.
[17:50:22] <matejjjj> hello, if I have array:[1,2,3,4,5] how do I query collection for array that has [1,1,1,1,*] where * is any number?
[18:06:20] <StephenLynx> matejjjj I am not sure it is possible. have you read and experimented with $elemMatch and $in?
[18:06:52] <StephenLynx> because not only are you looking for elements in a sub-array (which can be done), you are also looking for a specific sequence.
[18:07:56] <Torkable> a good work-through for getting to know transducers
[18:07:57] <StephenLynx> http://stackoverflow.com/questions/14769355/in-mongodb-how-to-search-in-arrays-for-sequences-and-complex-searches
[18:07:58] <Torkable> http://phuu.net/2014/08/31/csp-and-transducers.html
[18:08:25] <Torkable> can ignore the cap stuff at the top if you want
[18:10:45] <StephenLynx> that is "XY: the article". the author complains about events, but events are just syntax sugar for callbacks.
[18:10:58] <StephenLynx> so he is addressing a solution to the problem instead of the problem itself.
[18:11:25] <matejjjj> StephenLynx: didn't succeed
[18:11:39] <StephenLynx> matejjjj yeah, read the link for SO that I pasted.
[18:11:47] <StephenLynx> people confirmed what I suspected.
[18:12:14] <matejjjj> StephenLynx: I will check
[18:13:35] <matejjjj> so If I have a collection, I need to write a script
[18:14:15] <Torkable> dude wut
[18:14:20] <StephenLynx> matejjjj yeah, mongo won't do that internally, you will have to process the result in your code.
[18:14:29] <Torkable> events != callbacks
[18:14:39] <StephenLynx> "What's the problem?
[18:14:39] <StephenLynx> There's a whole stack of ideas that combine to make channels and transducers valuable, but I'll pick just one: events are a bad primitive for data flow."
[18:14:44] <StephenLynx> they solve the same problem.
[18:15:09] <StephenLynx> you can just have a secondary function as a property to the callback.
[18:15:10] <StephenLynx> I do that.
[18:15:24] <Torkable> thats like saying case statements and ifs solve the same problem so you should never even consider case
[18:15:34] <Torkable> and never learn them
[18:15:37] <Torkable> :\
[18:15:39] <StephenLynx> they don't solve the same problems
[18:15:45] <matejjjj> collection of {array:[1,2,3]},{array:[1,2,4]},{array:[2,1,2]} and to return all objects where array has [1,2,*anyNumber] - I didn't succeed, it always returns random positions and numbers
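For what it's worth, fixed array positions can be matched with numeric dot notation, which may cover the [1,2,*] case; the channel's conclusion below concerns more general sequence searches. A sketch:

    // Match documents whose array starts with 1, 2 and has exactly 3 elements;
    // the last position is left unconstrained (the "*" in the question):
    db.collection.find({ "array.0": 1, "array.1": 2, array: { $size: 3 } })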
[18:15:47] <StephenLynx> because switches are more fit for primitives
[18:16:15] <StephenLynx> that you can safely just do a boolean comparison instead of an equality check.
[18:16:37] <StephenLynx> with events and callbacks it is the same thing. you have a key and a function.
[18:16:49] <matejjjj> ok thanks guys, I will look more
[18:16:51] <StephenLynx> but with events you have a string and an object instead of a property in a function
[18:17:52] <StephenLynx> on switches:
[18:17:59] <StephenLynx> they have one feature you don't have in if-elses
[18:18:01] <Torkable> so mongodb's toArray and stream are the same thing?
[18:18:04] <StephenLynx> the ability to omit a break
[18:18:12] <Torkable> because you can arrive at the same result?
[18:18:25] <StephenLynx> no, it operates differently in a fundamental level.
[18:18:31] <Torkable> yes
[18:18:47] <StephenLynx> so how do events operate differently than callbacks on a fundamental level?
[18:19:29] <Torkable> consuming events as a stream is different than using callbacks
[18:19:48] <StephenLynx> I'm gonna do some research and resume.
[18:19:52] <Torkable> the link I sent you is using channels however
[18:20:54] <StephenLynx> when you mentioned events as a stream
[18:21:03] <StephenLynx> you meant to respond to stream events?
[18:21:28] <StephenLynx> and my point is that the link does not address the problem, but a solution to it.
[18:21:45] <StephenLynx> and said solution is syntax sugar to callbacks.
[18:22:27] <Torkable> csp?
[18:22:33] <Torkable> you should read about cap
[18:22:35] <Torkable> csp
[18:23:19] <StephenLynx> im reading, and I'm addressing the article's part that says "What's the problem?"
[18:23:28] <StephenLynx> then it proceeds to criticize events.
[18:23:28] <matejjjj> http://stackoverflow.com/questions/27949764/query-array-particular-array-elements-in-mongodb
[18:24:07] <Torkable> it addresses pros and cons of several different ideas I believe, including callbacks, promises, and FRP
[18:25:15] <Torkable> if you don't like the idea of csp then don't use it lol
[18:25:47] <Torkable> protip: don't use Go, its all csp
[18:28:59] <StephenLynx> if you can "just not use" then it is not necessary.
[18:29:20] <StephenLynx> when a functionality has a reason to exist, you can't just "not use it". there will be a case where you will have to use it.
[18:29:58] <StephenLynx> and matejjjj you already have an answer telling you it can't be done.
[18:31:02] <Torkable> do you think all solutions to async data are built on callbacks?
[18:31:46] <StephenLynx> not necessarily. but I'm yet to see them providing actual functionality
[18:31:58] <StephenLynx> instead, it's yet another way to perform what callbacks can.
[18:37:34] <Bico_Fino> Hello. I'm creating a mongo replica set (3 servers). My question is, do I need the same size of disks on the 3 servers?
[18:38:10] <cheeser> no but you'll be limited by the size of your smallest disk
[18:38:49] <Bico_Fino> cheeser, does the arbiter need to worry about this too?
[18:39:26] <Torkable> master chief take care of the arbiter
[18:39:26] <cheeser> the arbiter is not a data bearing member so it needs almost no disk space
[18:40:39] <Bico_Fino> I can add more replica sets later right? (I'm starting with primary/secondary and arbiter)
[18:43:22] <cheeser> you can add more members, yes.
[18:43:32] <cheeser> but a node can only be part of one replica set.
[18:44:11] <Bico_Fino> what's the main difference of replica set and shards?
[18:45:25] <cheeser> each shard can be comprised of its own replica set.
[18:46:01] <cheeser> so if your cluster has 5 shards and each shard is a 3-node replica set, that's a total of 15 nodes in the cluster.
[18:46:27] <cheeser> data will balance between the shards and that balancing will be replicated in each replSet
[18:50:03] <Bico_Fino> thanks cheeser
[18:51:07] <cheeser> np
[19:15:33] <bmbouter> I'm using mongo 2.4 and trying to test the authentication to a collection
[19:15:54] <bmbouter> I've set 'auth = true' in the conf file
[19:16:11] <bmbouter> I've verified that conf file is in use when I start the daemon, and I increased verbosity and started it in the foreground
[19:18:05] <bmbouter> and I have a user with several roles
[19:18:08] <bmbouter> http://fpaste.org/169702/12629591/
[19:23:09] <jiffe> is there a way I can start a replica member and have it skip the first operation it will try to sync?
[19:36:47] <Mmike> Hi, lads. When I do rs.initiate() that command returns 'ok' (assuming I didn't make a syntax error or provide a bogus rs.config or such). But the replica set is initialized only later. Subsequent calls to replSetGetStatus show that Mongo goes from throwing OperationalFailure with 'replset being initialized' text, then it is in state startup->startup2->recovering->secondary->primary.
[19:37:02] <Mmike> Is there a situation where going through this can stop with failure?
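A sketch of polling for the progression Mmike describes, using shell helpers (state codes: STARTUP 0, PRIMARY 1, SECONDARY 2, RECOVERING 3, STARTUP2 5):

    rs.initiate()
    // Wait until this member reports PRIMARY; rs.status() returns ok: 0
    // while the replica set is still initializing.
    while (true) {
        var s = rs.status();
        if (s.ok && s.myState === 1) break;
        sleep(1000);   // shell helper: pause one second between polls
    }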
[19:41:57] <bmbouter> wow mongodb 2.4 is difficult to configure
[19:42:00] <bmbouter> like WOW
[19:42:25] <bmbouter> I set the parameter enableLocalhostAuthBypass=0 when I start mongod
[19:42:56] <bmbouter> and yet in the verbose output I still see 'note: no users configured in admin.system.users, allowing localhost access' in the output
[19:51:37] <harttho> Do you have any users defined?
[19:54:44] <harttho> Your 'note' would suggest you don't
[19:55:57] <bmbouter> I have a user defined on the database of interest
[19:56:38] <bmbouter> http://fpaste.org/169731/26523614/
[19:56:57] <bmbouter> and yet I can use 'mongo' and then switch to the database and it lets me pass without auth because I'm localhost
[19:57:02] <harttho> You'll need it in admin.system.users
[19:57:09] <bmbouter> and where in the 2.4 docs is that?
[19:57:32] <bmbouter> what is the record format?
[19:57:35] <bmbouter> how do I add it?
[19:57:48] <harttho> http://docs.mongodb.org/manual/tutorial/add-user-administrator/#authenticate-with-full-administrative-access-via-localhost
[19:58:56] <bmbouter> those are the 2.6 docs
[19:59:38] <bmbouter> I tried this
[19:59:40] <bmbouter> http://docs.mongodb.org/v2.4/tutorial/add-user-administrator/#authenticate-with-full-administrative-access-via-localhost
[19:59:50] <bmbouter> but that still showed the 'note' I mention above
[19:59:52] <harttho> http://docs.mongodb.org/v2.4/tutorial/add-user-administrator/
[20:00:23] <harttho> > use admin
[20:00:25] <harttho> > show users
[20:00:43] <bmbouter> that shows none
[20:02:13] <harttho> use admin
[20:02:17] <harttho> add the user/admin role
[20:02:21] <bmbouter> ok I did that
[20:02:47] <harttho> does it still show none?
[20:02:51] <bmbouter> it shows my user
[20:02:54] <bmbouter> and gives me denials now
[20:02:58] <bmbouter> so that is good
[20:03:07] <bmbouter> how is that different from what I already had?
[20:03:15] <bmbouter> I had made a user on the database of interest 'pulp_database'
[20:05:04] <harttho> admin database is pre specified by mongo
[20:05:30] <bmbouter> ok that makes sense
[20:05:35] <bmbouter> and now I have an admin user
[20:05:49] <harttho> Does it work as you would like now?
[20:06:11] <bmbouter> well I have auth now, but I need this to have certain roles on the pulp_database
[20:06:27] <bmbouter> so I can make a second user with fewer privs
[20:06:51] <bmbouter> but do I make it in the admin db like we did here, or do I add it into my specific db of interest 'pulp_database'
[20:06:59] <harttho> I'd have to brush up, but I think if you have the second user within the pulp_database, you can restrict things within that database under the umbrella of the admin user
[20:07:10] <bmbouter> I was putting it at db.system.users.find( ) where db is pulp_database
[20:07:28] <bmbouter> do you always configure users in 'admin' or on the database itself 'pulp_database'
[20:07:31] <bmbouter> that is really my question
[20:08:08] <harttho> admin users go in admin; putting other users into the pulp database would be my recommendation
[20:08:51] <bmbouter> so now I have an admin user
[20:08:58] <harttho> "Authentication requires at least one administrator user in the admin database. You can create the user before enabling authentication or after enabling authentication."
[20:09:07] <bmbouter> ok so I've got that now
[20:09:27] <harttho> Now, with a little research, your first method should work fine
[20:09:37] <harttho> Having separate non-admin users specified within your pulpdb
[20:10:01] <harttho> admin will have access to all
[20:10:07] <harttho> the other users will have whatever you specify
[20:11:03] <bmbouter> hmmm ok
[20:11:08] <bmbouter> I will continue to work with this
[20:11:10] <bmbouter> thanks
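Summing up the setup bmbouter arrived at, in 2.4-era shell syntax (db.addUser; renamed db.createUser in 2.6). User names and passwords are placeholders:

    // One administrator in the admin database, required once auth = true is set:
    use admin
    db.addUser({ user: "siteAdmin", pwd: "...", roles: ["userAdminAnyDatabase"] })

    // The restricted application user lives in the application's own database:
    use pulp_database
    db.addUser({ user: "pulp", pwd: "...", roles: ["readWrite", "dbAdmin"] })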
[21:18:14] <jayjo> If I'm running a mongo daemon on my server to hold my data, what is the best way to ensure that data's integrity? Do I copy it to a local machine every night with a cron job or spin up a separate server for that same task?
[21:22:07] <joshua> I think that depends on how important the data is and what your resources are. You can do a database dump during off peak hours if you have a period where it won't interfere, you can run a replica set and use one of those nodes to do a filesystem snapshot, you can snapshot the filesystem where you have it now.
[21:30:03] <cheeser> jayjo: use a replica set and mms backup
[21:33:55] <joshua> Yeah if you can leverage MMS, go for it. Saves having to figure out all the stuff on your own
[21:59:39] <theRoUS> is there a way to apply .distinct() to a .find() resultset? i.e., i want to find the distinct values for a field within a subset of the collection rather than the whole collection
[22:01:32] <theRoUS> derrr, never mind, didn't read far enough
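For the record, what theRoUS presumably found: distinct() takes an optional query document as its second argument, restricting it to a subset of the collection. A sketch with hypothetical names:

    // Distinct values of "field" over only the documents matching the query:
    db.collection.distinct("field", { status: "active" })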
[22:16:44] <FunnyLookinHat> If I have a bunch of items in a collection with the same indexed key - is there a fast way to update each to have a new value for a different key rather than looping each record?
[22:17:44] <FunnyLookinHat> Ah I just wasn't googling right: http://docs.mongodb.org/manual/reference/method/db.collection.update/
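The linked page covers it; the key option is multi. A sketch with hypothetical field names:

    // One statement updates every document sharing the indexed key:
    db.items.update(
        { indexedKey: "sharedValue" },
        { $set: { otherField: "newValue" } },
        { multi: true }   // default is false: only the first match would be updated
    )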
[22:20:10] <catphish> i am looking to store large quantities of time-based numerical data, too much to reasonably store, so i'd like to store hourly, daily averages automatically and delete older data, is this easy to achieve?
[22:21:25] <kexmex> hey guys, i've found records with a missing field, and the code didn't change
[22:21:35] <kexmex> i looked through it, and besides save(), i see only updates with $set on specific fields
[22:21:42] <Torkable> catphish, run a cron job
[22:21:45] <kexmex> s/records/documents
[22:22:06] <catphish> Torkable: what would it do?
[22:22:28] <catphish> i know i can set a ttl on records to auto-delete, but i'm not sure how to handle the automatic aggregation
[22:22:30] <Torkable> catphish, run your aggregation query and purge old data
[22:23:33] <catphish> it's the aggregation query i'm unsure about, given the large quantity of data, i'm not sure what it should look like
[22:23:58] <Torkable> ???
[22:24:21] <catphish> let me explain better
[22:24:41] <catphish> the data is quite simple {"device":1, "timestamp": "2015-01-01 00:00:00", "value":100}
[22:25:07] <catphish> this will be stored at approx 30 second intervals for many thousand values of "device"
[22:25:22] <catphish> i'd like to keep this data for a limited period of time (that part is easy)
[22:25:49] <catphish> but i'd also like to store an average of the data for each hour (obviously kept for longer) and each day (kept indefinitely)
[22:25:57] <catphish> (per-"device")
[22:26:28] <catphish> so my main question is how to build the command to build the hourly / daily data
[22:27:34] <Torkable> create a small program in the language of your choice that runs the aggregation query
[22:27:45] <Torkable> set up a cron job that runs the script
[22:27:47] <Torkable> ???
[22:27:49] <Torkable> profit
[22:28:10] <Torkable> probably wanna create a compound index on the fields it will be matching on
[22:28:40] <catphish> that also makes sense
[22:28:59] <catphish> i'm just trying to get my head around what that aggregation query will look like
[22:29:08] <catphish> ie what it will match, and how it will store the results
[22:29:28] <Torkable> well I'm not sure what you want to know so I can't build the query for you
[22:29:51] <Torkable> but you could store the result object in a new collection so you have a log of the data through the day
[22:29:57] <joannac> kexmex: I presume the (unstated) question is "how did this happen"?
[22:30:02] <joshua> Please store the date as a date type, just because it bugs me when its not.
[22:30:03] <kexmex> yeh
[22:30:28] <joannac> kexmex: is the field meant to be there when inserted, or added later?
[22:30:44] <joannac> do you have profiling on, or is this a replica set? do you have verbose logging?
[22:30:57] <joannac> how many documents have missing fields?
[22:31:00] <kexmex> when inserted
[22:31:02] <kexmex> only 4 docs
[22:31:06] <kexmex> out of like a lot
[22:31:16] <kexmex> out of 79k
[22:31:28] <joannac> do you use the default objectid, or have any idea hwen they were inserted?
[22:31:31] <kexmex> and the value inserted is always DateTime.UtcNow
[22:31:38] <kexmex> yea
[22:31:42] <kexmex> i got timestamp from objectId
[22:31:44] <kexmex> now looking at logs
[22:31:46] <kexmex> to get clues
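The timestamp-from-ObjectId trick kexmex mentions is built into the shell:

    // A default _id encodes its creation time in its first four bytes:
    db.someCollection.findOne()._id.getTimestamp()   // hypothetical collection; returns an ISODate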
[22:32:15] <kexmex> all fields of the object are there, the ones that get inserted
[22:32:18] <kexmex> very weird
[22:32:27] <kexmex> although maybe i should check for documents missing those other fields as well
[22:32:35] <catphish> Torkable: thanks for your help by the way, i'm afraid i'm quite new to mongo so i hope i'm explaining myself clearly
[22:32:59] <joannac> kexmex: so the field is there, but no value?
[22:32:59] <catphish> as said, my input data is {"device":1, "timestamp": "2015-01-01 00:00:00", "value":100}, for many values of timestamp and device, and i will delete data in this collection older than approx 72 hours, i then want hourly averages for each value of device (i assume in another collection), however it is probably insane to completely regenerate all the hourly averages every time
[22:33:05] <kexmex> no, its not there
[22:33:05] <kexmex> at all
[22:33:13] <kexmex> i mean like, other fields from the insert are there
[22:33:19] <joannac> okay
[22:33:23] <kexmex> but maybe other fields that are present in this bad document, are missing elsewhere
[22:33:25] <kexmex> i'll just a few
[22:33:28] <kexmex> i'll check*
[22:33:43] <joannac> if you have a replicaset you can check the oplog too
[22:34:10] <Torkable> catphish, that's fine
[22:34:11] <kexmex> nop, no replicaset
[22:34:51] <Torkable> catphish, read the docs on aggregation queries and compound indexes
[22:35:28] <Torkable> may as well learn how to do it
[22:36:13] <kexmex> joannac: could it be that an update makes it to server before the insert? :)
[22:36:24] <kexmex> although the update is not touching that particular field
[22:36:27] <catphish> Torkable: thanks, am i right in thinking i can just overwrite data in my "hourly_averages" collection at regular intervals, looking only at data from he main table from the previous 120 minutes, running the query every 30 minutes, to ensure everything is correct?
[22:36:40] <catphish> (if that makes sense)
[22:37:13] <joshua> You can generate another date field for when the average was added, and put another TTL on it
[22:37:54] <Torkable> yea, and upsert on it so it will update if it exists and create if not
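Putting the thread's advice together, a sketch of the rollup job. Collection and field names follow catphish's example; it assumes MongoDB 2.6+ (for $dateToString) and that timestamp is stored as a Date, per joshua's note:

    // TTL index: raw 30-second samples expire after roughly 72 hours.
    db.readings.ensureIndex({ timestamp: 1 }, { expireAfterSeconds: 72 * 3600 })

    // Recompute per-device hourly averages over the last 120 minutes and upsert them,
    // so the cron job can run every 30 minutes and simply overwrite recent rollups:
    var cutoff = new Date(Date.now() - 120 * 60 * 1000);
    db.readings.aggregate([
        { $match: { timestamp: { $gte: cutoff } } },
        { $group: {
            _id: { device: "$device",
                   hour: { $dateToString: { format: "%Y-%m-%dT%H", date: "$timestamp" } } },
            avgValue: { $avg: "$value" },
            samples: { $sum: 1 }
        } }
    ]).forEach(function (doc) {
        db.hourly_averages.update(
            { device: doc._id.device, hour: doc._id.hour },
            { $set: { avgValue: doc.avgValue, samples: doc.samples } },
            { upsert: true }
        );
    });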
[22:38:47] <kexmex> hey joannac
[22:38:58] <kexmex> looks like none of the fields from INSERT are there
[22:39:14] <kexmex> basically insert didnt work
[22:39:21] <kexmex> the fields that are there are from an Upsert
[22:43:46] <joannac> kexmex: that sounds like your update is overwriting
[22:43:56] <kexmex> how
[22:44:03] <kexmex> update has $set only
[22:44:14] <joannac> pastebin the update
[22:45:23] <kexmex> http://pastie.org/private/8t5dx7stvillwq7fjjvxrq
[22:45:44] <kexmex> maybe the thread that was supposed to do the insert got aborted
[22:45:48] <kexmex> before the update went out over the wire
[22:46:25] <kexmex> i am guessing i should have a writer thread for this and i should just queue commands and have the writer thread pick them off one by one in proper order
[22:46:47] <joannac> "WriteConcern.Unacknowledged"?
[22:47:04] <kexmex> joannac: YEH
[22:47:05] <kexmex> yeh*
[22:47:16] <kexmex> not critical
[22:48:43] <joannac> writeconcern.unacknowledged means "i don't care if my write succeeded or not"
[22:48:56] <kexmex> yea
[22:50:08] <kexmex> i was just making sure that it's not some kind of corruption
[22:52:11] <kexmex> wait
[22:52:20] <kexmex> doesn't the Mongo driver have its own queue?
[22:52:47] <kexmex> so when i do collection.Save(), i am handing it over, and if the thread that called .Save() dies, that shouldn't prevent the insert from going through, should it?