[02:53:52] <JoseffB> ok so it updates the array to the root
[02:54:06] <JoseffB> so now _object and ratings are peers
[02:54:14] <JoseffB> but ratings should be under _object
[02:54:34] <JoseffB> when I try to fix the data array to include that top array I get this error
[02:54:45] <JoseffB> Cannot apply $addToSet modifier to non-array
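A minimal shell reproduction of the error JoseffB is hitting, assuming a document shape like the one described (the `items` collection name is made up): $addToSet can only append to a field that already holds an array, so it fails if `ratings` was stored as an embedded object.

```javascript
// Hypothetical doc: ratings stored as an embedded object, not an array.
db.items.insert({ _id: 1, _object: { ratings: { stars: 5 } } });

// Fails with "Cannot apply $addToSet modifier to non-array":
db.items.update({ _id: 1 }, { $addToSet: { "_object.ratings": { stars: 4 } } });

// Works once ratings is rewritten as an array:
db.items.update({ _id: 1 }, { $set: { "_object.ratings": [{ stars: 5 }] } });
db.items.update({ _id: 1 }, { $addToSet: { "_object.ratings": { stars: 4 } } });
```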
[02:55:09] <Determinist> If I try to pull the users being followed by 1234, this doesn't really work.
[02:56:16] <Determinist> If I have to maintain names for both the follower userID and the userID of the user being followed, that means I'll have to keep updating the names in a bunch of docs when a user changes their name.
[02:56:24] <IAD> JoseffB: please paste your doc somewhere
[03:00:09] <JoseffB> trimmed it down to the ratings branch
[03:02:36] <IAD> Determinist: you have to keep the key values
[03:04:12] <Determinist> IAD: you mean embed the names into the followers collection? What happens if a single user is following 6000 users and that user changes his name? That means an update command for 6000 documents. Sounds like a bad recipe for scaling? Unless, of course there's no way around that...
[03:06:37] <JoseffB> I don't know why $addToSet says the object is a non-array
[03:06:41] <IAD> Determinist: you can use the _id field to link users (not the username). it will be a little easier
[03:07:18] <Determinist> IAD: still doesn't alleviate the main bottlenecks of having to do massive updates, but what can ya do.
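A minimal sketch of IAD's suggestion, with hypothetical `users` and `follows` collections: link by the immutable _id so a rename touches exactly one document, and resolve names at read time.

```javascript
var alice = ObjectId(), bob = ObjectId();
db.users.insert({ _id: alice, name: "Alice" });
db.users.insert({ _id: bob, name: "Bob" });
db.follows.insert({ follower: alice, followed: bob });

// A rename is one single-document update, no fan-out across 6000 docs:
db.users.update({ _id: alice }, { $set: { name: "Alicia" } });

// Names are joined in at read time:
var ids = db.follows.find({ follower: alice }).map(function (f) { return f.followed; });
db.users.find({ _id: { $in: ids } }, { name: 1 });
```

The trade-off Determinist notes still stands: the read path now needs a second query, which is the usual price of avoiding mass updates.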
[03:09:55] <IAD> JoseffB: show the PHP code to update
[03:27:49] <JoseffB> that's what I used, and I took the object and ratings off the data
[03:34:47] <IAD> JoseffB: may be it will be helpful: http://stackoverflow.com/questions/10420635/mongodb-php-how-to-push-item-to-array-with-a-spcecific-key
[03:35:23] <IAD> it will be more convenient with the key
[04:28:19] <JoseffB> ok so now I can't get this to update
[04:52:04] <hdm> i am seeing performance drops on insert for a single instance of mongod, even when writes occur to different databases. these go away when i manually restart mongodb, and don't show up as a higher % of faults (but do as a higher % of locks)
[04:52:36] <hdm> any ideas why an instance that gets a new db every ~6 hours would spiral down in performance, with similar contents for each db, and inserts only into the latest
[04:53:36] <hdm> performance drops from 12k to 5k to 3k and now 1k per second over 48 hours, but across 7+ dbs, with only the latest being inserted
[04:54:03] <hdm> vsize/mapped shows ALL dbs accounted for, res is harder to tell
[05:08:53] <neiz> anyone familiar with the C# driver? Been trying to retrieve the only document in my collection (.FindOne()), but I cannot seem to coerce it into working
[06:54:28] <IAD> ali__: pretend to be the slave, or to have a central point for communication with the base
[08:10:47] <socke> hey dudes, i'm migrating from rdbms to mongo. i have a user system and a gps tracking system that tracks the users (which are cars). when designing a mongo db, should the user system be a part of the tracking system?
[08:11:46] <ron> socke: when moving to nosql dbs, your data structure should almost always conform to the way you want to query your data rather than how you want to save it.
[08:14:30] <socke> the thing is, i need to do user authentication first and then start tracking the user. at a later point i'd like to see the path the user took in a given time frame
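One plausible shape for this, following ron's query-first advice (collection and field names here are illustrative, not prescriptive): keep users and track points in separate collections, and index the points by user and timestamp, since "path over a time frame" is the query that matters.

```javascript
// One position document per GPS fix.
db.positions.insert({ userId: 42, ts: new Date(), loc: { lon: 13.4, lat: 52.5 } });

// Supports "path taken by user X between start and end":
db.positions.ensureIndex({ userId: 1, ts: 1 });
db.positions.find({
    userId: 42,
    ts: { $gte: ISODate("2012-10-01T00:00:00Z"), $lt: ISODate("2012-10-02T00:00:00Z") }
}).sort({ ts: 1 });
```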
[10:21:57] <samurai2> hi there, which one is faster : 1. filtering before map/reduce or 2. filtering inside map function? thanks
[10:23:53] <kali> samurai2: make sure you have a look at the aggregation framework too. anything you can do with it will be more efficient than map/reduce
[10:25:31] <samurai2> kali : but the aggregation framework has a limitation of only 16 MB on its result, right?
[10:26:24] <kali> samurai2: sure. but if you're generating a result bigger than 16MB, chances are you're abusing mongodb
[10:26:46] <kali> samurai2: mongo is designed for fast and small queries
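On samurai2's original question, a hedged sketch (the `events` collection and field names are invented): mapReduce takes a `query` option, and filtering there lets the server use indexes before any JavaScript runs, which is generally faster than an if-test inside the map function.

```javascript
db.events.mapReduce(
    function () { emit(this.type, 1); },                   // map
    function (key, values) { return Array.sum(values); },  // reduce
    {
        query: { ts: { $gte: ISODate("2012-10-01T00:00:00Z") } },  // filter before map
        out: { inline: 1 }
    }
);
// The alternative, filtering inside map with
//   if (this.ts >= cutoff) emit(this.type, 1);
// still feeds every document through the JS engine.
```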
[10:27:51] <samurai2> kali : and if I also need to filter based on geospatial indexing, it won't work as well, right?
[10:28:23] <kali> samurai2: i'm not clear on what works and what doesn't with geospatial indexing, i'm not using it
[10:28:41] <NodeX> you can't aggregate geospatial (yet)
[10:28:52] <NodeX> it's coming in the future apparently
[10:30:27] <samurai2> I'm hoping it can directly output the result into a certain collection, and also that we can decide the output key, not only key and value. :)
[10:30:47] <samurai2> and without output size limitation
[11:22:23] <topriddy> got this error from a java/morphia app. out of memory. mongodb top suspects http://pastebin.com/mFrgnWJz
[11:22:31] <topriddy> hoping someone is familiar with same
[11:24:08] <kali> topriddy: a quick fix is to increase the heap size with -Xmx
[11:25:14] <kali> topriddy: but you probably need to get some insight into what's going on in your app too
[11:26:12] <kali> topriddy: try -verbose:gc to start, but then you may need something more invasive (try the yourkit profiler for instance, there is a trial)
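For reference, the flags kali mentions would look roughly like this on a HotSpot JVM; the 2g heap value and the jar name are arbitrary examples, not recommendations:

```
java -Xmx2g -verbose:gc -jar yourapp.jar
```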
[11:27:12] <kali> topriddy: the fact mongodb appears so much in the stack is not that suspect, a database driver is likely to allocate lots of things
[11:28:21] <topriddy> kali: please, do you use the mongodb/morphia/java stack?
[11:29:07] <topriddy> kali: how do you skip morphia??
[11:29:25] <kali> topriddy: ... i just call the java driver
[11:30:51] <topriddy> kali: hmm... i'm suspecting maybe the jvm is simply the problem. it probably needs more heap space.
[11:31:44] <kali> topriddy: you can try upping -Xmx, but if the problem does not go away, you'll have to profile gc and memory usage
[11:32:39] <topriddy> kali: my experience with profiling gc/memory usage is low.
[11:33:18] <kali> topriddy: well, unfortunately, this is the real world of the jvm in production :)
[11:34:29] <topriddy> kali: thanks man. you have been helpful. i sure hope "mongodb not being good enough for production" is not the culprit though
[11:36:28] <kali> topriddy: this is memory consumption in the jvm, not in mongodb. i'm pretty sure there are no hidden dragons in the mongo java driver
[11:38:17] <kali> topriddy: usually this is due to bad application code (like overenthusiastic caching, leading to reference/memory leaks) or a heap size not big enough for the jvm working set
[12:14:14] <topriddy> kali: how do you even cache explicitly? i'm not doing any caching, nor have i read about it in mongo yet
[12:34:20] <fatninja> I have var a = "string"; and I want to search all fields that contain a, so I should do something like db.users.find({name: /a/}); but we know that doesn't work, since the literal matches the letter "a" rather than the variable (using server side javascript)
[12:37:06] <NodeX> I am not fully sure that mongo reads global variables in its query syntax
[12:39:01] <fatninja> NodeX, I use a driver to communicate to the db. But I think the solution is this: var regex = new RegExp(a, "g"); and {name: regex}
[12:41:35] <NodeX> perhaps, I don't do that sort of thing in the shell so I couldn't comment
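fatninja's own fix, sketched: a regex literal /a/ matches the letter "a", so the variable has to go through the RegExp constructor. (The "g" flag is a no-op for query matching; "i" for case-insensitivity is the usual addition.)

```javascript
var a = "string";
var regex = new RegExp(a);       // or new RegExp(a, "i") for case-insensitive
db.users.find({ name: regex });  // matches any name containing "string"
```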
[13:28:42] <NodeX> 'Apple tried to argue that it would take at least 14 days to put a corrective statement on the site – a claim that one judge said he "cannot believe".'
[13:29:17] <NodeX> Apple = 10 year old child who's been told off and is now sulking
[13:45:39] <lapdis> Apple = geniuses who are gaming the system
[13:46:11] <robo> hello: right now we're using replica sets and I'm seeing heavy I/O across all our replica set nodes, because of the overhead of replication itself I believe. I'm trying to talk our DBAs into going with shards but they are a bit nervous because of the testing involved. Anyone have any gems of wisdom to drop on this?
[13:46:38] <lapdis> sry, I believe this channel is dedicated to bashing apple
[13:47:14] <NodeX> or to put it another way.... Apple are overpriced manufacturers of equipment who take open source cores and close them off for their own personal gain
[13:47:45] <robo> yeah, that's a very profitable business model these days
[13:48:07] <NodeX> then they whine when they don't get their own way
[13:58:44] <skot> looks like normal and expected behavior, and disk io.
[13:59:00] <skot> I'd suggest throttling your import if you want it to tax your systems less.
[13:59:09] <robo> so they have to run these to keep our data fresh. The problem with replica sets is that everything has to replicate to all nodes. I figure that if you use shards it will cut down on the overall i/o used
[13:59:33] <skot> it will redistribute it, but the same data needs to be written and replicated
[13:59:52] <skot> You can also just add more/faster disks.
[14:00:23] <skot> I wouldn't suggest sharding for this issue, just making the process less impactful or adding more resources to the nodes.
[14:00:32] <robo> the backend data disks are fiber channel luns to EMC VNX
[14:00:53] <robo> probably can't get much faster unless we go with DAS
[14:01:25] <robo> skot, what do you think of sharding overall? I talked to a few people that hit some major bugs with it
[14:01:45] <robo> i think it's a must-do for us because it's going to be hard to scale with replica sets as we put more traffic on it
[14:01:47] <skot> It has its uses but it depends what your problems are.
[14:02:25] <skot> your set doesn't seem to have traffic patterns where a single instance can't handle the load, does it?
[14:02:26] <robo> okay, so first hit problems then think of sharding
[14:02:44] <robo> nope. But we only have about 20% of our traffic going to it right now (slowly moving off oracle.)
[16:24:29] <gshipley> Anyone know if 2.2.1 is just a bugfix release with the 3-4 issues listed in the change doc? I am not sure I am reading it correctly?
[16:41:43] <_m> According to their server's JIRA: odd version numbers are unstable dev branches (1.1, 1.3, 1.5); even numbers are stable branches and only get bug fixes, etc. (1.2, 1.4, 1.6)
[17:32:41] <wereHamster> I basically have: Game = { players: [...]}, each player has a score, and I want to rank games by the highest score any player has reached in it.
[17:33:17] <mikejw> on second thoughts I guess it doesn't really make sense retrieving objects from the cursor more than once :)
[17:33:19] <wereHamster> or should I create a field on the Game (maxScoreByAnyPlayer) and sort by that?
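A sketch of wereHamster's second option, which is the usual answer to this kind of ranking question (the maxScore field and the variables are illustrative): maintain the denormalized maximum on write with a guarded update, then sort on it. The dedicated $max update operator arrived in a later MongoDB release, so the 2.2-era pattern is a conditional $set.

```javascript
// Hypothetical values; in reality these come from the incoming score event.
var gameId = 1, newScore = 9000;

// Matches the game only when the new score beats the stored maximum
// (assumes maxScore was initialized when the game doc was created):
db.games.update({ _id: gameId, maxScore: { $lt: newScore } },
                { $set: { maxScore: newScore } });

// Ranking games is then a plain indexed sort:
db.games.find().sort({ maxScore: -1 });
```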
[19:37:05] <andrecolin> i have a setup running auth and i am using a simple ruby script. i can log in to the admin db, change to my test db, and query the collections, and i see the system.users collection. how can i access the user names in that collection using my code?
[19:48:38] <_m> Is my_db really a hash? How did you create the connection? What are the actual errors? Pastie/gist so we have enough context to actually help.
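For what it's worth, system.users is queryable like any ordinary collection; this shell sketch (the "test" db name matches andrecolin's description, the projection is illustrative) shows the shape, and drivers expose the same thing through their collection API:

```javascript
// Works from the mongo shell; the logged-in user needs read access to the db.
var testDb = db.getSiblingDB("test");
testDb.system.users.find({}, { user: 1 });
```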
[19:49:48] <bricker> Hello - at /var/lib/mongo I have a bunch of files of databasename.1, databasename.2, etc. Are these backups (i.e., safe to delete the older ones)?
[19:52:23] <crudson> bricker: that's your data, it gets split up into files of increasing size, maxing at 2gb(ish), but also preallocated before you hit the limit
[19:57:08] <andrecolin> crudson: my test box just went south, need to reboot, will be back with status
[20:12:21] <mrichman> Where can I find a sample database?
[20:15:26] <bricker> Forgive me, I'm new to mongodb and am afraid of deleting something that I didn't mean to. Will this query remove any entries with the timestamp before October 1? db.stream_minutes_events.remove({ "t" : { $lt: ISODate("2012-10-01T00:00:00.000Z") } })
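That second form is the syntactically valid one (remove takes a single query document). A cautious pattern before any destructive remove, sketched here, is to run the same filter through find/count first:

```javascript
// Dry run: how many docs would go?
db.stream_minutes_events.find({ t: { $lt: ISODate("2012-10-01T00:00:00.000Z") } }).count();

// Then the actual delete, same filter:
db.stream_minutes_events.remove({ t: { $lt: ISODate("2012-10-01T00:00:00.000Z") } });
```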
[20:19:32] <naquad> also what about this: http://pastebin.com/FD3xe6Jt - true or false?
[20:20:21] <hdm> it's someone's experience, everyone has their own opinion, and until a few months ago i had unrelated headaches that drove me crazy
[20:20:44] <hdm> but mongo 2.2 has been a huge improvement overall and most of the issues i ran into are now just a matter of allocation/sizing up front
[20:20:46] <crudson> that's also a pretty old blog that has done the rounds a lot
[20:21:55] <hdm> the safe writes thing is overblown, you can enable it in your driver, or not; it depends on whether you need confirmation of every write, some use cases don't
[20:22:25] <hdm> the sharding stuff is still suboptimal imo, i've seen weird/bad balancing, and adding/removing shards is a pain, especially compared to something like elastic
[20:23:47] <hdm> fwiw, i run a ~1.5 billion record / 3.5TB dataset for a hobby project, my roadblocks tend to be things like concurrent cpu use since i'm focusing on single/huge nodes, or slow disk io once indexes grow beyond ram
[20:24:25] <mrichman> that sounds like quite a hobby project -- what is it?
[20:24:27] <hdm> using mongo properly with horizontal scaling/replica sets/sharding would work better if i could afford to
[20:24:54] <hdm> mrichman: scanning the internet for ~6 months or so
[20:25:31] <hdm> https://speakerdeck.com/hdm/derbycon-2012-the-wild-west <- not much there about the mongo setup, but thats the source of it
[20:30:55] <mrichman> are there any sample databases out there to speak of, something like Northwind in SQL Server? Just need to sink my teeth into something real
[20:32:42] <NodeX> one main point of nosql style data stores is that the shape of the data is your choice
[20:33:42] <mrichman> understood… but say i wanted to play with map reduce on a large dataset… kinda hard unless i contrive the data myself… seems like there should be some archetypal examples out there
[20:34:00] <hdm> mrichman: pretty easy to load up things like us census data
[20:34:13] <NodeX> any csv / json doc will load easy
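For instance, assuming a local CSV with a header row, mongoimport gets it into a collection in one line; the db, collection, and file names here are placeholders:

```
mongoimport --db test --collection census --type csv --headerline --file census.csv
```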
[20:51:20] <hdm> yeah, it changes depending on what set; i'm reloading it to a different schema on one of the boxes almost all the time
[20:51:25] <TecnoBrat> are you running multiple mongod instances etc .. or are you using VMs?
[20:52:03] <hdm> VMs failed miserably (kvm and vmware), weird storage issues; multiple mongods were ok, but sharding fell on its face to the point it wouldn't scale
[20:52:30] <hdm> the primary shard ended up with 3 x the data of the others and the disk i/o was maxed, so it couldnt slough it off and handle incoming inserts fast enough
[20:53:09] <hdm> even with only _id indexes, the read overhead got crazy, at the time drives were raid-0 x 4, so decent number of spindles
[20:53:45] <hdm> the 1T ssds (raid-0 as well) get 1000MB/s r/w and 25,000 IOPS, but only go to 1T, which doesn't work so well for the current size
[20:56:45] <hdm> it's reloading the data still, but going way faster; better than mongo's normal downward spiral where insert performance dies as the index grows beyond ram
[21:14:30] <Guest46986> Following this information: http://docs.mongodb.org/manual/tutorial/deploy-replica-set/
[21:14:30] <Guest46986> I've tried time and time again. And all I can get is: exception: need most members up to reconfigure, not ok.
[21:14:30] <Guest46986> All three instances are up, followed the documentation just fine.
[21:14:30] <Guest46986> I was able to run rs.initiate() once, looked like it worked, so I did it again for the other instance,
[21:14:30] <Guest46986> Of course I did some looking around, I noticed mention of a "\etc\hosts" <--- The heck is that? Do I need it? Is there something else I'm supposed to review that I might have missed?
[21:14:30] <Guest46986> Running Windows Server 2008 R2 64-bit
[21:22:46] <hdm> ^ example of how to test sharding/replicas/config locally
[21:24:10] <Guest46986> With or without the port, it yields the same error.
[21:24:29] <Guest46986> exception: need most members up to reconfig...
[21:28:16] <Guest46986> If anyone is available to help me solve my issue, I've posted it here as well: https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/QTqV1G3LUGU
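For the record, the documented flow runs rs.initiate() on exactly one member and adds the rest from there; initiating each instance separately creates competing one-member sets, which would be consistent with the "need most members up to reconfigure" error above. A sketch, with placeholder host names:

```javascript
// On one member only:
rs.initiate();
rs.add("host2:27017");
rs.add("host3:27017");
rs.status();  // verify all three members appear
```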
[21:35:15] <cyberfart> I have a quick question: are "$in" queries considered equality test or range query? I'm trying to work out the better compound_index ordering
[21:39:49] <TecnoBrat> cyberfart: it's not a range ... it's an array. You are finding any records which have a value that exists in the array
[21:40:09] <TecnoBrat> "$in: [1, 4]" means that the field is either a 1 or a 4
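In shell terms (collection and field names invented), TecnoBrat's description looks like this:

```javascript
// Matches docs whose field equals any listed value: a set of point lookups,
// not a scan over the interval [1, 4].
db.things.find({ field: { $in: [1, 4] } });
```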
[21:55:39] <raja> how can i do a query where i don't want a unique on a field but a limit.. for example.. i want to query all stories but limit it to 2 per userid
[22:04:21] <Guest46986> Had no idea you were talking to me.
[22:04:40] <Guest46986> Because I clearly stated in my post that I don't have those directories.
[22:04:43] <hdm> raja: ^ add a {{ '$limit' => 2 }} after $project 'ing your unique key
[22:05:04] <raja> hdm: cool thanks for the example
[22:07:01] <Guest46986> Look, I'm sorry I upset you, that was not my intention. I just figured you were talking to someone else. Don't see why you would bring up /etc/hosts if you were.
[22:10:40] <Guest46986> cat /etc/hosts returns: etc is not defined.
[22:11:50] <Guest46986> I do not understand where I should put these directories or how to configure them. They were not covered in the documentation I followed.
[22:13:12] <LouisT> Guest46986: this is on a windows server?
[23:18:42] <jrdn> i was just wondering if i should build increment functionality in my domain model persistence manager since the update needs the domain model
[23:19:12] <Zelest> somewhat odd question; a server with only ~10MB/s write-speed, is that usable for mongodb?
[23:19:16] <Zelest> or should I find another host?
[23:19:56] <jrdn> if you have bad IO and high traffic, you'll cry
[23:19:59] <ron> jrdn: using the $inc is an atomic operation. using the first option is not. in a multi-threaded environment, that matters.
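ron's point in shell form (the counter document is hypothetical): $inc is applied server-side in one step, whereas a client-side read-modify-write of the same counter can lose updates under concurrency.

```javascript
// Atomic: two concurrent increments both land.
db.counters.update({ _id: "pageviews" }, { $inc: { n: 1 } });

// Racy: fetch, add, write back; interleaved threads can overwrite each other.
var doc = db.counters.findOne({ _id: "pageviews" });
db.counters.update({ _id: "pageviews" }, { $set: { n: doc.n + 1 } });
```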
[23:20:13] <jrdn> and @ron, pings don't live on the moon.. your mom is on earth
[23:27:39] <Raynos> What should the sort Object parameter look like in mapReduce
[23:27:48] <Raynos> I know sort arrays look like [[key, direction]]
[23:27:53] <Raynos> does it want { key: direction } ?
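For the shell at least, yes: mapReduce's sort option is a document, the same shape cursor.sort() takes; the [[key, direction]] array form is a driver-level convention. A sketch, assuming a collection with a field k (the sort key needs an index):

```javascript
db.coll.ensureIndex({ k: 1 });  // mapReduce's sort must be backed by an index
db.coll.mapReduce(
    function () { emit(this.k, 1); },
    function (key, vals) { return Array.sum(vals); },
    { out: { inline: 1 }, sort: { k: 1 } }  // document form, like cursor.sort()
);
```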
[23:30:19] <bricker> When I delete entries from a collection, when does that data get removed from disk?
[23:39:39] <wereHamster> bricker: at an unspecified time later.
[23:42:23] <bricker> wereHamster: Thanks, just wanted to make sure I didn't have to do it manually or something :)
[23:44:26] <wereHamster> that unspecified time may also be never. Why do you care?
[23:45:01] <bricker> wereHamster: Oh... because we have some data that is taking up a huge amount of space on our server, and we're just trying to clean it up
[23:45:27] <wereHamster> you may want to try db.repairDatabase()
[23:45:39] <bricker> wereHamster: old data that we no longer care about (stats collected from Cube)
[23:45:45] <bricker> wereHamster: I'll look at it, thanks again
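For reference, a sketch of the helper wereHamster means: repairDatabase rewrites the current database's data files compactly, but it blocks the database and needs free disk roughly on the order of the data size while it runs, so it is worth scheduling.

```javascript
// Rewrites and compacts all data files for the current db,
// returning space freed by earlier removes back to the OS.
db.repairDatabase();
```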
[23:48:46] <wereHamster> ah, cube. Do you still use it?
[23:55:30] <bricker> wereHamster: yes but we only need stats month-to-month